Guide to OCR for Indic Scripts Venu Govindaraju


$365.89
Condition - New
Only 2 left

Summary

This is the first comprehensive text on Optical Character Recognition for Indic scripts. It covers many topics and describes OCR systems for eight different scripts: Bangla, Devanagari, Gurmukhi, Gujarati, Kannada, Malayalam, Tamil and Urdu.

Guide to OCR for Indic Scripts Summary

Guide to OCR for Indic Scripts: Document Recognition and Retrieval by Venu Govindaraju

The original motivations for developing optical character recognition technologies were modest: to convert printed text on flat physical media to digital form, producing machine-readable digital content. By doing this, words that had been inert and bound to physical material would be brought into the digital realm and thus gain new and powerful functionalities and analytical possibilities.

First-generation digital OCR researchers in the 1970s quickly realized that by limiting their ambitions primarily to contemporary documents printed in standard font type from the modern Roman alphabet (and of these, mostly English language materials), they were constraining the possibilities for future research and technologies considerably. Domain researchers also saw that the trajectory of OCR technologies if left unchanged would exclude a large portion of the human record. Digital conversion of documents and manuscripts in other alphabets, scripts, and cursive styles was of critical importance. Embedded in non-Roman alphabet source documents, including ancient manuscripts, papyri scrolls, clay tablets, and other inscribed artifacts was not only a wealth of scholarly information but also new opportunities and challenges for advancing OCR, imaging sciences, and other computational research areas.

The limiting circumstances at the time included the rudimentary capability (and high cost) of computational resources and lack of network-accessible digital content. Since then computational technology has advanced at a very rapid pace and networking infrastructure has proliferated. Over time, this exponential decrease in the cost of computation, memory, and communications bandwidth combined with the exponential increase in Internet-accessible digital content has transformed education, scholarship, and research. Large numbers of researchers, scholars, and students use and depend upon Internet-based content and computational resources.

The chapters in this book describe a critically important area of investigation: addressing conversion of Indic script into machine-readable form. Rough estimates have it that currently more than a billion people use Indic scripts. Collectively, Indic historic and cultural documents contain a vast richness of human knowledge and experience. The state-of-the-art research described in this book demonstrates the multiple values associated with these activities. Technically, the problems associated with Indic script recognition are very difficult and will contribute to and inform related script recognition efforts. The work also has enormous consequence for enriching and enabling the study of Indic cultural heritage materials and the historic record of its people. This in turn broadens the intellectual context for domain scholars focusing on other societies, ancient and modern.

Digital character recognition has brought about another milestone in collective communication by bringing inert, fixed-in-place, text into an interactive digital realm. In doing so, the information has gained additional functionalities which expand our abilities to connect, combine, contextualize, share, and collaboratively pursue knowledge making. High-quality Internet content continues to grow in an explosive fashion. In the new global cyber environment, the functionalities and applications of digital information continue to transform knowledge into new understandings of human experience and the world in which we live. The possibilities for the future are limited only by available research resources and capabilities and the imagination and creativity of those who use them.

Arlington, Virginia
Stephen M.

Table of Contents

Part I: Recognition of Indic Scripts
Building Data Sets for Indian Language OCR Research (C. V. Jawahar, Anand Kumar, A. Phaneendra and K. J. Jinesh)
On OCR of major Indian scripts: Bangla and Devanagari (B. B. Chaudhari)
A Complete Machine Printed Gurmukhi OCR System (Gurpreet Singh Lehal)
Progress in Gujarati Document Processing and Character Recognition (Jignesh Dholakia, Atul Negi and S. Rama Mohan)
Design of a bilingual Kannada-English OCR (R. S. Umesh, P. B. Pati and A. G. Ramakrishnan)
Recognition of Malayalam Documents (N. V. Neeba, Anoop Namboodiri, C. V. Jawahar and P. J. Narayanan)
A Complete OCR System for Tamil Magazine Documents (K. H. Aparna and V. S. Chakravarthy)
Experiments on Urdu Text Recognition (Omar Mukhtar, Srirangaraj Setlur and Venu Govindaraju)
The BBN Byblos Hindi OCR System (Prem Natarajan, Ehry MacRostie and Michael Decerbo)
Generalization of Hindi OCR using Adaptive Segmentation and Font Files (Mudit Agrawal, Huanfeng Ma and David Doermann)
Online Handwriting Recognition for Indic Scripts (A. Bharath and Sriganesh Madhvanath)

Part II: Retrieval of Indic Documents
Enhancing Access to Primary Cultural Heritage Materials of India (Peter M. Scharf and Malcolm Hyman)
Digital Image Enhancement of Indic Historical Manuscripts (Zhixin Shi, Srirangaraj Setlur and Venu Govindaraju)
GFG based Compression and Retrieval of Document Images in Indian Scripts (Gaurav Harit, Shantanu Chaudhary and Ritu Garg)
Word spotting for Indic documents to facilitate retrieval (Anurag Bhardwaj, Srirangaraj Setlur and Venu Govindaraju)
Indian Language Information Retrieval (Prasenjit Majumder and Mandar Mitra)

Additional information

SKU: NLS9781447125181
ISBN-13: 9781447125181
ISBN-10: 1447125185
Title: Guide to OCR for Indic Scripts: Document Recognition and Retrieval by Venu Govindaraju
Condition: New
Binding: Paperback
Publisher: Springer London Ltd
Publication date: 2012-03-14
Pages: 325
N/A
Book picture is for illustrative purposes only, actual binding, cover or edition may vary.
This is a new book, so you will be the first to read this copy. With untouched pages and a perfect binding, your brand new copy is ready to be opened for the first time.