sicos1977 / pageorientationengine
Detect the text orientation on a page with Tesseract OCR
Hello Sicos,
I wrote the code below, but the result is always PageCorrect. Can you help me?
private void button1_Click(object sender, EventArgs e)
{
    TesseractTeste();
}

private void TesseractTeste()
{
    string testImagePath = @"C:\Temp\1.jpg";
    string myLang = "eng";
    string tesseractDir = @"C:\Temp\Tesseract-OCR\tessdata";
    var documentInspector = new DocumentInspector(tesseractDir, myLang);
    var bmp = new Bitmap(testImagePath);
    try
    {
        using (var engine = new TesseractEngine(tesseractDir, "eng", EngineMode.Default))
        using (var img = Pix.LoadFromFile(testImagePath))
        using (var page = engine.Process(img))
        {
            var a = documentInspector.DetectPageOrientation(bmp);
            MessageBox.Show(a.ToString(), "");
        }
    }
    catch (Exception e)
    {
        Trace.TraceError(e.ToString());
    }
}
I've made some naive changes to this function, which appear to work for all four 90° rotations.
I also had to adjust the counter condition shown below to get it to work at all; the original condition was counter < 5.
if (counter < 2) continue;
found = pageIterator.TryGetBoundingBox(PageIteratorLevel.TextLine, out rect);
break;
```csharp
public DocumentInspectorPageOrientation DetectPageOrientation(Bitmap bitmap)
{
    if (bitmap == null)
        throw new ArgumentNullException(nameof(bitmap), "The bitmap parameter is not set");

    if (bitmap.PixelFormat == PixelFormat.Format1bppIndexed)
        bitmap = BitmapUtils.CopyToBpp(bitmap, 8);

    using (var engine = new TesseractEngine(TesseractDataPath, TesseractLanguage))
    {
        var rect = new Rect();

        using (var image = PixConverter.ToPix(bitmap))
        using (var page = engine.Process(image, PageSegMode.AutoOsd))
        {
            var pageIterator = page.AnalyseLayout();
            pageIterator.Begin();

            while (pageIterator.Next(PageIteratorLevel.Block))
            {
                var found = false;

                while (pageIterator.Next(PageIteratorLevel.Para))
                {
                    var counter = 0;
                    while (pageIterator.Next(PageIteratorLevel.TextLine, PageIteratorLevel.Word))
                        counter++;

                    // The original condition was counter < 5
                    if (counter < 2) continue;

                    found = pageIterator.TryGetBoundingBox(PageIteratorLevel.TextLine, out rect);
                    break;
                }

                var croppedRect = new Rectangle(rect.X1, rect.Y1, rect.Width, rect.Height);

                if (rect.Height == 0)
                    return DocumentInspectorPageOrientation.Undetectable;

                // The using block disposes the cropped copy on every return path
                using (var croppedImage = found
                    ? bitmap.Clone(croppedRect, bitmap.PixelFormat)
                    : (Bitmap)bitmap.Clone())
                {
                    // The OCR confidence on the first run (no rotation)
                    float firstMeanConfidence;
                    // The OCR confidence on the second run (rotated 90 degrees)
                    float secondMeanConfidence;
                    // The OCR confidence on the third run (rotated 180 degrees)
                    float thirdMeanConfidence;
                    // The OCR confidence on the fourth run (rotated 270 degrees)
                    float fourthMeanConfidence;

                    using (var engineCroppedImage = new TesseractEngine(TesseractDataPath, TesseractLanguage))
                    {
                        using (var imageNormal = PixConverter.ToPix(croppedImage))
                        using (var pageNormal = engineCroppedImage.Process(imageNormal))
                            firstMeanConfidence = pageNormal.GetMeanConfidence();

                        if (firstMeanConfidence > 0.75)
                            return DocumentInspectorPageOrientation.PageCorrect;

                        // Rotate image 90 degrees
                        croppedImage.RotateFlip(RotateFlipType.Rotate90FlipNone);
                        //croppedImage.Save(@"d:\\Crop area flipped.tif", System.Drawing.Imaging.ImageFormat.Tiff);

                        using (var imageRotated90 = PixConverter.ToPix(croppedImage))
                        using (var pageRotated90 = engineCroppedImage.Process(imageRotated90))
                            secondMeanConfidence = pageRotated90.GetMeanConfidence();

                        if (secondMeanConfidence > 0.75)
                            return DocumentInspectorPageOrientation.PageRotatedLeft;

                        // Rotate image another 90 degrees (180 in total)
                        croppedImage.RotateFlip(RotateFlipType.Rotate90FlipNone);
                        //croppedImage.Save(@"d:\\Crop area flipped.tif", System.Drawing.Imaging.ImageFormat.Tiff);

                        using (var imageRotated180 = PixConverter.ToPix(croppedImage))
                        using (var pageRotated180 = engineCroppedImage.Process(imageRotated180))
                            thirdMeanConfidence = pageRotated180.GetMeanConfidence();

                        if (thirdMeanConfidence > 0.65)
                            return DocumentInspectorPageOrientation.PageUpsideDown;

                        // Rotate image another 90 degrees (270 in total)
                        croppedImage.RotateFlip(RotateFlipType.Rotate90FlipNone);
                        //croppedImage.Save(@"d:\\Crop area flipped.tif", System.Drawing.Imaging.ImageFormat.Tiff);

                        using (var imageRotated270 = PixConverter.ToPix(croppedImage))
                        using (var pageRotated270 = engineCroppedImage.Process(imageRotated270))
                            fourthMeanConfidence = pageRotated270.GetMeanConfidence();

                        if (fourthMeanConfidence > 0.65)
                            return DocumentInspectorPageOrientation.PageRotatedRight;
                    }

                    // No run passed its threshold: fall back to the highest confidence
                    float[] vals = { firstMeanConfidence, secondMeanConfidence, thirdMeanConfidence, fourthMeanConfidence };
                    int hpos = -1;
                    float hval = 0;

                    for (int i = 0; i < vals.Length; i++)
                    {
                        if (vals[i] > hval) { hpos = i; hval = vals[i]; }
                    }

                    if (hpos == 0) return DocumentInspectorPageOrientation.PageCorrect;
                    if (hpos == 1) return DocumentInspectorPageOrientation.PageRotatedLeft;
                    if (hpos == 2) return DocumentInspectorPageOrientation.PageUpsideDown;
                    if (hpos == 3) return DocumentInspectorPageOrientation.PageRotatedRight;

                    return DocumentInspectorPageOrientation.Undetectable;

                    /*
                    if (firstMeanConfidence > 0.40 && secondMeanConfidence > 0.40)
                        return firstMeanConfidence >= secondMeanConfidence
                            ? DocumentInspectorPageOrientation.PageCorrect
                            : DocumentInspectorPageOrientation.PageRotatedLeft;
                    */
                }
            }
        }

        return DocumentInspectorPageOrientation.Undetectable;
    }
}
```
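The fallback at the end of the function (pick whichever of the four runs scored highest) can be tested on its own, without Tesseract. Below is a minimal sketch of that step only. The `PageOrientationGuess` enum and the `OrientationPicker` class are hypothetical stand-ins I made up for illustration; they are not part of the library, and the enum only mirrors the member names of `DocumentInspectorPageOrientation` used above.

```csharp
using System;

// Hypothetical stand-in for DocumentInspectorPageOrientation (assumption, not the library's type).
public enum PageOrientationGuess
{
    Undetectable,
    PageCorrect,
    PageRotatedLeft,
    PageUpsideDown,
    PageRotatedRight
}

// Hypothetical helper, not part of the library.
public static class OrientationPicker
{
    // Same order as the four OCR runs above: 0, 90, 180 and 270 degrees of rotation.
    private static readonly PageOrientationGuess[] Candidates =
    {
        PageOrientationGuess.PageCorrect,
        PageOrientationGuess.PageRotatedLeft,
        PageOrientationGuess.PageUpsideDown,
        PageOrientationGuess.PageRotatedRight
    };

    // Returns the orientation whose run had the highest mean confidence,
    // or Undetectable when no run scored above zero.
    public static PageOrientationGuess Pick(float[] confidences)
    {
        if (confidences == null)
            throw new ArgumentNullException(nameof(confidences));
        if (confidences.Length != Candidates.Length)
            throw new ArgumentException("Expected one confidence per rotation.", nameof(confidences));

        int best = -1;
        float bestValue = 0f;

        for (int i = 0; i < confidences.Length; i++)
        {
            // Strictly greater, so the earliest run wins a tie.
            if (confidences[i] > bestValue)
            {
                best = i;
                bestValue = confidences[i];
            }
        }

        return best < 0 ? PageOrientationGuess.Undetectable : Candidates[best];
    }
}
```

For example, `OrientationPicker.Pick(new[] { 0.10f, 0.62f, 0.31f, 0.20f })` returns `PageRotatedLeft`, and an array of four zeros returns `Undetectable`, which matches what the loop over `vals` does in the function above.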