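1 Introduction
🔥 High-quality competition project series. Today's project is
Design of a deep-learning-based bank card recognition algorithm
The project is fairly novel and works well as a competition topic. Highly recommended!
🧿 More materials and project sharing: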
https://gitee.com/dancheng-senior/postgraduate
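2 Algorithm design flow
Bank card number recognition works by first locating the card in the image. Once the card's position is fixed, the number region is split into characters, and each character is compared against the model to find the most similar digit. The flowchart consists of the following main steps:
1. Bank card number image
Because card numbers are private personal data, it is hard to collect a large set of real cards for testing in a short time. The tests here use the author's own cards and simulated cards.
2. Image preprocessing
Preprocessing has to happen first once the image is acquired. The card image is converted to grayscale to drop information that is useless for recognition, then normalized so that only the valid card-number region is kept.
3. Character segmentation
After preprocessing yields the valid region, that region is refined further and split into the smallest recognizable units, the individual characters.
4. Character recognition
After segmentation, each character is analyzed and matched using image recognition techniques. The method used here is template matching.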
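Step 3 can be illustrated with a column projection. This is a minimal sketch and not the author's implementation: it assumes the number region has already been binarized (1 = ink, 0 = background), and the function name `split_columns` is hypothetical.

```python
import numpy as np

def split_columns(binary):
    """Split a binary image (1 = ink, 0 = background) into character column spans."""
    has_ink = binary.sum(axis=0) > 0           # True for columns containing any ink
    spans, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                           # a character begins here
        elif not ink and start is not None:
            spans.append((start, x))            # a character ends (end is exclusive)
            start = None
    if start is not None:                       # character touching the right edge
        spans.append((start, len(has_ink)))
    return spans

# Toy image with two "characters" separated by empty columns.
demo = np.zeros((5, 10), dtype=np.uint8)
demo[1:4, 1:3] = 1
demo[1:4, 5:8] = 1
print(split_columns(demo))  # [(1, 3), (5, 8)]
```

Projection only works when characters do not touch and the background is clean, which is likely why the implementation later in this post uses a trained detector for segmentation instead.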
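2.1 Color space conversion
Card-number recognition does not depend on color, so the card's color is irrelevant and is filtered out first during preprocessing. Keeping color information also wastes space, increases computation, lowers overall efficiency, and interferes with later image analysis. Grayscale conversion is therefore used to remove it.
Grayscale conversion turns color information into luminance, reducing the original three-channel color value to a single luminance value. The idea is to represent the R (red), G (green), and B (blue) components of each pixel of the original color image with one gray value g. The detailed flow is shown in the figure.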
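As a rough illustration of mapping R, G, and B to one gray value g, the sketch below uses a weighted sum. The weights are an assumption (the common ITU-R BT.601 luma weights); the post does not say which weights it uses, and `to_gray` is a hypothetical helper name.

```python
import numpy as np

# Assumed weights (ITU-R BT.601 luma); the post does not specify its own.
WEIGHTS = np.array([0.299, 0.587, 0.114])

def to_gray(rgb):
    """Convert an H x W x 3 RGB uint8 image to an H x W uint8 gray image."""
    gray = rgb.astype(np.float64) @ WEIGHTS     # g = 0.299 R + 0.587 G + 0.114 B
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

red = np.array([[[255, 0, 0]]], dtype=np.uint8)        # one pure-red pixel
white = np.array([[[255, 255, 255]]], dtype=np.uint8)  # one white pixel
print(to_gray(red), to_gray(white))
```

The result has one channel instead of three, which is the space and computation saving described above.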
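2.2 Edge cropping
Because background patterns and number fonts vary from card to card, the number cannot be segmented directly from the captured image. It has to be located accurately first to obtain the valid region. The region containing the digit characters has many foreground pixels, so a threshold on this property is used to determine where the number region lies in the original image. The edge-cropping design is shown in the figure.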
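One way to read the threshold idea is as a row projection: count foreground pixels per row and keep the rows that reach a threshold. The sketch below assumes a binarized image; `find_number_band` and `min_pixels` are hypothetical names, and the threshold value is picked only for the toy data.

```python
import numpy as np

def find_number_band(binary, min_pixels):
    """Return (top, bottom) spanning the rows whose foreground count reaches min_pixels."""
    rows = np.where(binary.sum(axis=1) >= min_pixels)[0]
    if rows.size == 0:
        return None                              # no row is dense enough
    # Spans from the first to the last qualifying row; bottom is exclusive.
    return int(rows[0]), int(rows[-1]) + 1

card = np.zeros((8, 12), dtype=np.uint8)
card[3:6, 2:10] = 1      # dense band standing in for the digits
card[0, 5] = 1           # stray background pixel, below the threshold
band = find_number_band(card, min_pixels=4)
print(band)  # (3, 6)
```

On real cards the threshold would have to be tuned, since background patterns can also produce dense rows.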
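2.3 Template matching
Template matching compares each character to be recognized against a fixed set of templates. Each segmented character image is compared with the template digit images one by one, similarity is used to measure how close the two are, and the closest template gives the recognized digit. The matching cost decreases as the feature level increases. Based on the features obtained in the first step, a second, correlation-based computation is chosen to solve the image matching problem. The template matching flow for bank cards is shown in the figure.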
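The comparison against templates can be sketched as follows. The distance measure here (mean squared difference, lower means more similar) is an assumption, since the post does not name its similarity measure; the 3x3 templates are toy data and `match_digit` is a hypothetical helper.

```python
import numpy as np

def match_digit(char_img, templates):
    """Return (best_digit, distance) using mean squared difference as the distance."""
    best_digit, best_dist = None, float("inf")
    for digit, tmpl in templates.items():
        dist = float(np.mean((char_img.astype(np.float64) - tmpl) ** 2))
        if dist < best_dist:
            best_digit, best_dist = digit, dist
    return best_digit, best_dist

# Toy 3x3 templates standing in for real digit images.
templates = {
    0: np.array([[1, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=np.float64),
    1: np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]], dtype=np.float64),
}
noisy_one = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 1]], dtype=np.float64)
digit, dist = match_digit(noisy_one, templates)
print(digit, round(dist, 3))  # 1 0.111
```

Character images and templates must share the same size for this to work, so real inputs would be resized to the template size first.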
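2.4 Card number recognition
Card-number recognition has its own peculiarities: most bank cards on the market carry embossed (raised) digits, so conventional character recognition does not apply well. The scheme studied here targets this case. The embossed characters are preprocessed first, and a sliding-window algorithm then matches the characters window by window. Recognition starts at the left edge of the cropped image with a selection window of 64*48 pixels, since the characters to be recognized are typically about 45 pixels in size. All digits on the card are matched and recognized in this loop. If the minimum match value is greater than the configured threshold, the window is treated as a blank area with no character, and nothing is output. The window position j then advances, the loop condition f < 19 && j + 20 < image length is checked, and the loop ends with the character count f and the final position j.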
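The loop above can be sketched in a few lines. This is an illustration under assumptions, not the author's code: the window width, step, distance measure, and the choice to jump a full window after a hit are all hypothetical, and the toy templates are 3x3 rather than 64*48. Only the cap of 19 characters and the "minimum above threshold means blank" rule come from the text.

```python
import numpy as np

def read_digits(strip, templates, win_w, step, max_dist, max_chars=19):
    """Slide a window left to right; emit a digit where the best match is close enough."""
    digits, j = [], 0
    while len(digits) < max_chars and j + win_w <= strip.shape[1]:
        window = strip[:, j:j + win_w].astype(np.float64)
        dists = {d: float(np.mean((window - t) ** 2)) for d, t in templates.items()}
        best = min(dists, key=dists.get)
        if dists[best] <= max_dist:
            digits.append(best)
            j += win_w          # assumed: skip past the recognized character
        else:
            j += step           # minimum distance above threshold: blank, no output
    return digits

templates = {
    0: np.array([[1, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=np.float64),
    1: np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]], dtype=np.float64),
}
# Strip layout: digit 1, a blank gap, digit 0.
strip = np.hstack([templates[1], np.zeros((3, 3)), templates[0]])
result = read_digits(strip, templates, win_w=3, step=1, max_dist=0.15)
print(result)  # [1, 0]
```

The cap `max_chars=19` mirrors the `f < 19` condition, which matches the maximum length of common card numbers.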
3 Bank card character localization - implementation
The first step is to detect the card-number region within the whole bank card image and crop it out. This step uses Faster R-CNN.
The detected card-number region is cropped and saved.
The main program is shown below (not the complete code):
    #!/usr/bin/env python
    from __future__ import absolute_import
    from __future__ import division
    from __future__ import print_function

    import argparse
    import os

    import cv2
    import matplotlib.pyplot as plt
    import numpy as np
    import tensorflow as tf  # TensorFlow 1.x API

    from lib.config import config as cfg
    from lib.utils.nms_wrapper import nms
    from lib.utils.test import im_detect
    from lib.nets.vgg16 import vgg16
    from lib.utils.timer import Timer

    os.environ["CUDA_VISIBLE_DEVICES"] = '0'  # make only the first GPU visible
    config = tf.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = 0.8  # use at most 80% of that GPU's memory
    config.gpu_options.allow_growth = True  # allocate GPU memory on demand
    sess = tf.Session(config=config)

    CLASSES = ('__background__', 'lb')
    NETS = {'vgg16': ('vgg16_faster_rcnn_iter_70000.ckpt',),
            'res101': ('res101_faster_rcnn_iter_110000.ckpt',)}
    DATASETS = {'pascal_voc': ('voc_2007_trainval',),
                'pascal_voc_0712': ('voc_2007_trainval+voc_2012_trainval',)}


    def vis_detections(im, class_name, dets, thresh=0.5):
        """Draw detected bounding boxes."""
        inds = np.where(dets[:, -1] >= thresh)[0]
        if len(inds) == 0:
            return
        im = im[:, :, (2, 1, 0)]  # BGR -> RGB for matplotlib
        fig, ax = plt.subplots(figsize=(12, 12))
        ax.imshow(im, aspect='equal')
        sco = []
        for i in inds:
            score = dets[i, -1]
            sco.append(score)
        maxscore = max(sco)  # highest detection score
        for i in inds:
            score = dets[i, -1]
            if score == maxscore:
                bbox = dets[i, :4]  # the four box coordinates
                # `filename` is the module-level name set in the main block
                img = cv2.imread("data/demo/" + filename)
                # img = cv2.imread('data/demo/000002.jpg')
                sp = img.shape
                width = sp[1]
                # Pad the box by 20 px on each side where there is room.
                # The original three `if` branches left `cropped` undefined
                # when neither side had room; this form covers every case.
                x0 = int(bbox[0] - 20) if bbox[0] > 20 else int(bbox[0])
                x1 = int(bbox[2]) + 20 if bbox[2] + 20 < width else int(bbox[2])
                cropped = img[int(bbox[1]):int(bbox[3]), x0:x1]  # crop is [y0:y1, x0:x1]
                path = 'cut1/'
                # resize the crop; dsize is given as (width, height)
                res = cv2.resize(cropped, (1000, 100), interpolation=cv2.INTER_CUBIC)
                cv2.imwrite(path + str(i) + filename, res)
                ax.add_patch(plt.Rectangle((bbox[0], bbox[1]),
                                           bbox[2] - bbox[0],
                                           bbox[3] - bbox[1], fill=False,
                                           edgecolor='red', linewidth=3.5))
                ax.text(bbox[0], bbox[1] - 2,
                        '{:s} {:.3f}'.format(class_name, score),
                        bbox=dict(facecolor='blue', alpha=0.5),
                        fontsize=14, color='white')
        ax.set_title(('{} detections with '
                      'p({} | box) >= {:.1f}').format(class_name, class_name, thresh),
                     fontsize=14)
        plt.axis('off')
        plt.tight_layout()
        plt.draw()


    def demo(sess, net, image_name):
        """Detect object classes in an image using pre-computed object proposals."""
        # Load the demo image
        im_file = os.path.join(cfg.FLAGS2["data_dir"], 'demo', image_name)
        im = cv2.imread(im_file)
        # Detect all object classes and regress object bounds
        timer = Timer()
        timer.tic()
        scores, boxes = im_detect(sess, net, im)
        timer.toc()
        print('Detection took {:.3f}s for {:d} object proposals'.format(timer.total_time, boxes.shape[0]))
        # Visualize detections for each class
        CONF_THRESH = 0.1
        NMS_THRESH = 0.1
        for cls_ind, cls in enumerate(CLASSES[1:]):
            cls_ind += 1  # because we skipped background
            cls_boxes = boxes[:, 4 * cls_ind:4 * (cls_ind + 1)]
            cls_scores = scores[:, cls_ind]  # an array of 300 scores
            # np.newaxis adds a dimension; np.hstack joins boxes and scores column-wise
            dets = np.hstack((cls_boxes, cls_scores[:, np.newaxis])).astype(np.float32)
            keep = nms(dets, NMS_THRESH)
            dets = dets[keep, :]
            vis_detections(im, cls, dets, thresh=CONF_THRESH)


    def parse_args():
        """Parse input arguments."""
        parser = argparse.ArgumentParser(description='Tensorflow Faster R-CNN demo')
        parser.add_argument('--net', dest='demo_net', help='Network to use [vgg16 res101]',
                            choices=NETS.keys(), default='vgg16')
        parser.add_argument('--dataset', dest='dataset', help='Trained dataset [pascal_voc pascal_voc_0712]',
                            choices=DATASETS.keys(), default='pascal_voc')
        args = parser.parse_args()
        return args


    if __name__ == '__main__':
        args = parse_args()
        # model path
        demonet = args.demo_net
        dataset = args.dataset
        # tfmodel = os.path.join('output', demonet, DATASETS[dataset][0], 'default', NETS[demonet][0])
        tfmodel = r'./default/voc_2007_trainval/cut1/vgg16_faster_rcnn_iter_8000.ckpt'
        # fail early if the checkpoint is missing
        if not os.path.isfile(tfmodel + '.meta'):
            print(tfmodel)
            raise IOError(('{:s} not found.\nDid you download the proper networks from '
                           'our server and place them properly?').format(tfmodel + '.meta'))
        # set config
        tfconfig = tf.ConfigProto(allow_soft_placement=True)
        tfconfig.gpu_options.allow_growth = True
        # init session
        sess = tf.Session(config=tfconfig)
        # load network
        if demonet == 'vgg16':
            net = vgg16(batch_size=1)
        # elif demonet == 'res101':
        #     net = resnetv1(batch_size=1, num_layers=101)
        else:
            raise NotImplementedError
        net.create_architecture(sess, "TEST", 2,
                                tag='default', anchor_scales=[8, 16, 32])
        saver = tf.train.Saver()
        saver.restore(sess, tfmodel)
        print('Loaded network {:s}'.format(tfmodel))
        # # run detection on every image in the folder
        # for filename in os.listdir(r'data/demo/'):
        #     im_names = [filename]
        #     for im_name in im_names:
        #         print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')
        #         print('Demo for data/demo/{}'.format(im_name))
        #         demo(sess, net, im_name)
        #
        # plt.show()
        # run detection on a single image
        filename = '0001.jpg'
        im_names = [filename]
        for im_name in im_names:
            print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')
            print('Demo for data/demo/{}'.format(im_name))
            demo(sess, net, im_name)
        plt.show()
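The listing widens the detected box by 20 pixels on each side when there is room. A clamped variant is sketched below as an alternative. It is not what the listing does (near an edge it pads as far as the image allows instead of skipping the padding), and `padded_span` is a hypothetical helper.

```python
def padded_span(x0, x1, width, pad=20):
    """Widen the span [x0, x1) by `pad` on each side, clamped to [0, width]."""
    left = max(int(x0) - pad, 0)
    right = min(int(x1) + pad, width)
    return left, right

print(padded_span(50.7, 300.2, 400))  # (30, 320)
print(padded_span(5, 390, 400))       # (0, 400)
```

The returned pair can be used directly as the column slice of the crop, for example `img[y0:y1, left:right]`.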
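The result looks like this:
4 Character segmentation
The cropped number image is saved and then split into individual characters:
The main program follows the same steps and principle as in the first stage above. The only difference is how the training set is set up.
The result looks like this:
5 Bank card digit recognition
Partial code only: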
    import os

    import matplotlib.pyplot as plt
    import numpy as np
    import tensorflow as tf  # TensorFlow 1.x API
    from PIL import Image

    from nets2 import nets_factory

    # number of distinct characters (digits 0-9)
    CHAR_SET_LEN = 10
    # image height (not used in this excerpt)
    IMAGE_HEIGHT = 60
    # image width (not used in this excerpt)
    IMAGE_WIDTH = 160
    # batch size
    BATCH_SIZE = 1
    # path to the tfrecord file
    TFRECORD_FILE = r"C:\workspace\Python\Bank_Card_OCR\demo\test_result\tfrecords/1.tfrecords"

    # placeholder for a batch of 224x224 grayscale images
    x = tf.placeholder(tf.float32, [None, 224, 224])

    os.environ["CUDA_VISIBLE_DEVICES"] = '0'  # make only the first GPU visible
    config = tf.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = 0.5  # use at most 50% of that GPU's memory
    config.gpu_options.allow_growth = True  # allocate GPU memory on demand


    # read one example from the tfrecord file
    def read_and_decode(filename):
        # build a filename queue
        filename_queue = tf.train.string_input_producer([filename])
        reader = tf.TFRecordReader()
        # returns the key and the serialized example
        _, serialized_example = reader.read(filename_queue)
        features = tf.parse_single_example(serialized_example,
                                           features={
                                               'image': tf.FixedLenFeature([], tf.string),
                                               'label0': tf.FixedLenFeature([], tf.int64),
                                           })
        # decode the image bytes
        image = tf.decode_raw(features['image'], tf.uint8)
        # raw grayscale image, no preprocessing
        image_raw = tf.reshape(image, [224, 224])
        # tf.train.shuffle_batch needs a fixed shape
        image = tf.reshape(image, [224, 224])
        # preprocessing: scale pixels from [0, 255] to [-1, 1]
        image = tf.cast(image, tf.float32) / 255.0
        image = tf.subtract(image, 0.5)
        image = tf.multiply(image, 2.0)
        # label
        label0 = tf.cast(features['label0'], tf.int32)
        return image, image_raw, label0


    # image data and label
    image, image_raw, label0 = read_and_decode(TFRECORD_FILE)

    # shuffle_batch returns randomly shuffled batches
    image_batch, image_raw_batch, label_batch0 = tf.train.shuffle_batch(
        [image, image_raw, label0], batch_size=BATCH_SIZE,
        capacity=50000, min_after_dequeue=10000, num_threads=1)

    # network definition
    train_network_fn = nets_factory.get_network_fn(
        'alexnet_v2',
        num_classes=CHAR_SET_LEN * 1,
        weight_decay=0.0005,
        is_training=False)

    # The original created a session with `config` and then opened a second,
    # unconfigured one; a single configured session is used here instead.
    with tf.Session(config=config) as sess:
        # inputs: a tensor of size [batch_size, height, width, channels]
        X = tf.reshape(x, [BATCH_SIZE, 224, 224, 1])
        # run the input through the network
        logits, end_points = train_network_fn(X)
        # predicted class
        logits0 = tf.slice(logits, [0, 0], [-1, 10])
        predict0 = tf.argmax(logits0, 1)
        # initialize variables
        sess.run(tf.global_variables_initializer())
        # load the trained model
        saver = tf.train.Saver()
        saver.restore(sess, '../Cmodels/model/crack_captcha1.model-6000')
        # saver.restore(sess, '../1/crack_captcha1.model-2500')
        # coordinator that manages the reader threads
        coord = tf.train.Coordinator()
        # start the queue runners; the filename queue is filled from here on
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        for i in range(6):
            # fetch one batch of images and labels
            b_image, b_image_raw, b_label0 = sess.run([image_batch,
                                                       image_raw_batch,
                                                       label_batch0])
            # show the image
            img = Image.fromarray(b_image_raw[0], 'L')
            plt.imshow(img)
            plt.axis('off')
            plt.show()
            # print the label
            print('label:', b_label0)
            # predict
            pred = sess.run([predict0], feed_dict={x: b_image})
            # print the prediction
            print('predict:', pred[0])
        # ask the other threads to stop
        coord.request_stop()
        # returns only after all other threads have stopped
        coord.join(threads)
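The preprocessing inside `read_and_decode` divides by 255, subtracts 0.5, and multiplies by 2, which maps pixel values from [0, 255] to [-1, 1]. The NumPy sketch below reproduces just that arithmetic outside TensorFlow; `normalize` is a hypothetical helper name.

```python
import numpy as np

def normalize(image_u8):
    """Map uint8 pixels from [0, 255] to float32 values in [-1, 1]."""
    scaled = image_u8.astype(np.float32) / 255.0   # [0, 1]
    return (scaled - 0.5) * 2.0                    # [-1, 1]

sample = np.array([0, 255], dtype=np.uint8)
out = normalize(sample)
print(out)  # [-1.  1.]
```

Any image fed to the trained network at inference time needs the same scaling that was applied to the training data.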
Final result: