PDF files longer than 1024 pages are not included in asset file content searching #9053

Closed
88250 opened this issue Aug 27, 2023 · 2 comments

Comments

@88250
Member

88250 commented Aug 27, 2023

Performance was improved in #9051, but parsing is still very slow for very large PDF files, and memory usage is too high. For now, the only option is to exclude PDF files longer than 1024 pages from parsing.

PDF with 13744 pages using 12 workers took 17m4.761218s
@88250 88250 added this to the 2.10.2 milestone Aug 27, 2023
@88250 88250 self-assigned this Aug 27, 2023
@zxhd863943427
Contributor

I think this would be better as a configurable setting, because there are real use cases where a long indexing time is acceptable.

@88250
Member Author

88250 commented Aug 27, 2023

@zxhd863943427 We will consider it later if there is demand for it.
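
If the limit were made configurable as suggested, one possible shape is sketched below. Everything here is an assumption for illustration: the thread does not define any such option, and `SearchSettings`, `MaxPDFPages`, and `allowsPDF` are hypothetical names. Only the 1024-page default comes from this issue.

```go
package main

import "fmt"

// defaultMaxPDFPages mirrors the 1024-page limit from the issue title.
const defaultMaxPDFPages = 1024

// SearchSettings is a hypothetical settings struct, not an existing option.
type SearchSettings struct {
	// MaxPDFPages is the largest PDF (in pages) that will be parsed.
	// 0 means "use the default"; a negative value means "no limit".
	MaxPDFPages int
}

// allowsPDF reports whether a PDF with pageCount pages should be parsed
// under these settings.
func (s SearchSettings) allowsPDF(pageCount int) bool {
	switch {
	case s.MaxPDFPages < 0:
		return true // user accepts long indexing times
	case s.MaxPDFPages == 0:
		return pageCount <= defaultMaxPDFPages
	default:
		return pageCount <= s.MaxPDFPages
	}
}

func main() {
	defaults := SearchSettings{}
	unlimited := SearchSettings{MaxPDFPages: -1}
	fmt.Println(defaults.allowsPDF(13744))  // default limit applies
	fmt.Println(unlimited.allowsPDF(13744)) // limit disabled by the user
}
```

Treating the zero value as "use the default" means existing configurations without the field would keep the current behavior.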

88250 added a commit that referenced this issue Aug 27, 2023

…tent searching #9053
88250 added a commit that referenced this issue Aug 27, 2023

…tent searching #9053
@88250 88250 closed this as completed Aug 27, 2023