In what scenarios do you need this feature?
After configuring SiYuan's OCR, I found the recognition rate low. I then switched to software that calls the PaddleOCR API and found that recognition was better for both English and Chinese. I hope SiYuan can replace its current OCR engine with PaddleOCR.
Describe the optimal solution
PaddleOCR has better text recognition capabilities than Tesseract.
Quote:
PaddleOCR recently released its v3 version, which significantly improves the handling of spaces in English text. I tried the English model, and it works very well.
In document scenarios, PaddleOCR can achieve 95%+ accuracy, whereas Tesseract may be confused by some rhythmic characters.
PaddleOCR's performance in some non-Latin languages is especially impressive. For Arabic, for example, the results are far better than those of EasyOCR and Tesseract.
Highly recommend PaddleOCR!!!
PaddleOCR is a deep learning-based OCR system built on PaddlePaddle, the open-source deep learning framework developed by Baidu. PaddleOCR supports numerous languages, including Chinese, English, Japanese, and Korean, and can reliably detect text in different styles and fonts.
Advantages:
High accuracy: PaddleOCR has achieved state-of-the-art performance on various OCR benchmarks, including the ICDAR 2015 and ICDAR 2017 competitions.
Fast and efficient: PaddleOCR is optimized for speed and can process large volumes of images in real time, making it suitable for applications that require high throughput.
Easy to use: PaddleOCR has a user-friendly interface that allows users to quickly train and deploy OCR models.
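The accuracy figures quoted above are easier to compare when every engine is scored the same way on the same images. Below is a minimal sketch of one common measure, character accuracy (1 minus the character error rate), assuming you have a hand-checked transcript for each test image. The function names and the sample strings are made up for illustration and do not come from SiYuan, PaddleOCR, or Tesseract.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance where each insert, delete, or substitute costs 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def char_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - CER, clamped to [0, 1]. `reference` is the hand-checked text."""
    if not reference:
        return 1.0 if not hypothesis else 0.0
    cer = levenshtein(reference, hypothesis) / len(reference)
    return max(0.0, 1.0 - cer)


if __name__ == "__main__":
    # Hypothetical outputs from two engines for the same image.
    truth = "hello world"
    engine_outputs = {"engine_a": "hello world", "engine_b": "helloworld"}
    for name, text in engine_outputs.items():
        print(name, round(char_accuracy(truth, text), 3))
```

A dropped space, like the English spacing problem mentioned in the quote, counts as one edit here, so the score reflects spacing errors as well as misread characters.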
Reference:
Describe the candidate solution
pls
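As a rough illustration of what calling PaddleOCR might look like, here is a minimal sketch that assumes the 2.x release of the `paddleocr` Python package, where `PaddleOCR(...).ocr(path, cls=True)` returns one list per image and each entry has the form `[box, (text, confidence)]`. Other releases may use a different call signature and result layout, so check the official documentation before relying on it. The helper names `extract_text` and `ocr_image` and the 0.5 confidence threshold are assumptions made for this example, and the sketch says nothing about how SiYuan itself would integrate the engine.

```python
from typing import List


def extract_text(page_result, min_confidence: float = 0.5) -> List[str]:
    """Collect recognized strings from one page of 2.x-style results.

    Each entry is assumed to look like [box, (text, confidence)].
    A page with no detected text may be None, which yields an empty list.
    """
    lines = []
    for entry in page_result or []:
        text, confidence = entry[1]
        if confidence >= min_confidence:
            lines.append(text)
    return lines


def ocr_image(image_path: str, lang: str = "en") -> List[str]:
    """Run PaddleOCR on one image and return the recognized lines."""
    # Imported lazily so extract_text can be used without paddleocr installed.
    from paddleocr import PaddleOCR

    engine = PaddleOCR(use_angle_cls=True, lang=lang)  # lang="ch" for Chinese
    result = engine.ocr(image_path, cls=True)  # assumed: one list per image
    return extract_text(result[0])
```

Low-confidence lines are filtered out before the text is returned, so the threshold is worth tuning per document type.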
Other information
.