Pegdown processor hangs when data to be parsed is html #207
Comments
@sunitapatro, I suggest you dump the |
Thanks for responding. Here is the 'data' input to pegDownProcessor.markdownToHtml(data) |
@sunitapatro, what I meant was: add code to log the data received from the get stream before passing it to pegdown, so that the cause of the hang can be debugged. You should probably also add code at the same point to detect that the data coming back is HTML rather than markdown and present it as is, without pegdown processing, so that the user gets some feedback when this happens. |
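The check suggested above could be sketched roughly as follows. This is a minimal illustration, not pegdown code: the class `MarkdownInputGuard`, its methods, and the `markdownRenderer` parameter are hypothetical names, and the heuristic (the payload starts with a doctype or an `<html` tag) is an assumption about what a login-page response looks like. In real code the renderer argument would presumably be something like `pegDownProcessor::markdownToHtml`.

```java
import java.util.Locale;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical helper: decides whether a payload should go through the markdown parser at all.
final class MarkdownInputGuard {
    private MarkdownInputGuard() {}

    // Heuristic only: true when the payload starts like a complete HTML document.
    // Markdown that merely contains inline HTML tags is not matched.
    static boolean looksLikeHtmlDocument(String data) {
        if (data == null) {
            return false;
        }
        String trimmed = data.trim();
        // Only the first few characters matter, so avoid lower-casing the whole payload.
        String head = trimmed.substring(0, Math.min(trimmed.length(), 64)).toLowerCase(Locale.ROOT);
        return head.startsWith("<!doctype html") || head.startsWith("<html");
    }

    // Logs what was received, then either passes HTML through untouched
    // or hands the text to the supplied markdown renderer.
    static String render(String data, Function<String, String> markdownRenderer) {
        Objects.requireNonNull(data, "data");
        boolean html = looksLikeHtmlDocument(data);
        System.err.println("received " + data.length() + " chars, looksLikeHtml=" + html);
        return html ? data : markdownRenderer.apply(data);
    }
}
```

With this in place, a redirected login page would be shown as is, and only text that does not look like a full HTML document would reach the parser.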
The data returned by getStreamAsString() is exactly the content of "urldata.txt", which I shared earlier. I saved it as .txt only so I could share it here. In short, urldata.txt (which contains HTML) is the input to PegDownProcessor. I understand that it is the wrong kind of content for PegDownProcessor, but I was expecting PegDownProcessor to throw an exception, return null, or something like that. In reality it hangs. |
Input validation is really limited to markdown and the handled HTML tags. Handling an unadulterated HTML response from a server is outside pegdown's intended application. I do agree that it should not hang, but without a file that causes the hang I can't begin to figure out what causes it. It is up to the implementation-specific code to make sure that what is fed into pegdown can at least be considered markdown. |
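Until the cause is found, calling code can protect itself so that a parse that never returns does not block the caller. The following is a generic sketch under stated assumptions, not part of pegdown: `TimedRender` and `renderOrFallback` are hypothetical names, and the `renderer` argument stands in for whatever call performs the conversion. Note that cancelling only interrupts the worker thread; a parser that ignores interruption would keep running in the background on a daemon thread.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Hypothetical wrapper: runs a renderer on a worker thread and gives up after a deadline.
final class TimedRender {
    private TimedRender() {}

    static String renderOrFallback(String data, Function<String, String> renderer,
                                   long timeoutMillis, String fallback) {
        // Daemon thread so a stuck parse cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "markdown-render");
            t.setDaemon(true);
            return t;
        });
        Future<String> future = executor.submit(() -> renderer.apply(data));
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // best effort: interrupts the worker
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the renderer itself threw
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Returning a fallback string is one possible design; rethrowing a dedicated exception would match the "parsing exception" behaviour requested earlier in the thread equally well.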
Please read the content of urldata.txt into a stream, convert it to a String, and pass it to PegDownProcessor; you will then be able to reproduce the PegDownProcessor hang. |
@vsch |
@sunitapatro, to save time, I renamed the file to urldata.md and opened it in IDEA with my IntelliJ IDEA plugin (idea-multimarkdown), which uses pegdown as its parser. I saw no issues and no hangs. The problem occurs when you read the file as a stream and convert it to a string, so it is not a pegdown issue but an issue in the code that runs before the pegdown call. As a test, I suggest that before passing the string data to pegdown you convert it to a char[] via string.toCharArray(), which is what pegdown does, then dump the char array as bytes to a file and examine what content you are really passing to pegdown. The file you provided does not cause pegdown to hang, so the issue is somewhere else. |
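A minimal sketch of the dump described above, assuming each char is written as two big-endian bytes so the file shows exactly what the parser receives. `ParserInputDump` and its methods are hypothetical names, and the target path is up to the caller.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical debugging helper: writes the exact chars handed to the parser to a file.
final class ParserInputDump {
    private ParserInputDump() {}

    // Encodes every char as two bytes (big-endian), without any charset translation,
    // so stray control characters or unpaired surrogates stay visible in a hex viewer.
    static byte[] toUtf16Bytes(String data) {
        char[] chars = data.toCharArray();
        ByteBuffer buffer = ByteBuffer.allocate(chars.length * 2); // big-endian by default
        buffer.asCharBuffer().put(chars);
        return buffer.array();
    }

    // Writes the raw bytes to the given file and returns the path that was written.
    static Path dump(String data, Path target) {
        try {
            return Files.write(target, toUtf16Bytes(data));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `ParserInputDump.dump(data, Paths.get("pegdown-input.bin"))` just before the pegdown call and opening the file in a hex viewer should show whether the string differs from the file that was shared.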
I have run into this problem too, with version 1.6.0:
final PegDownProcessor pegDownProcessor = new PegDownProcessor(Extensions.ALL_OPTIONALS | Extensions.ALL_WITH_OPTIONALS, 5000);
// markdownText is pure HTML
final RootNode node = pegDownProcessor.parseMarkdown(markdownText.toCharArray());
With code like the above, execution stops exactly at the parseMarkdown call. |
I solved the problem by wrapping the data in extra tags, like "<html><body>" + data + "</body></html>". |
I am trying to read from a URL that I do not have access to, so the request is redirected to a login page. As a result, the data passed to pegDownProcessor.markdownToHtml(data) is actually HTML.
I was expecting either null or a parsing exception, but it hangs at markdownToHtml(data).
Here is my code:
Any help in dealing with this is appreciated.
Thanks!!!