![jsoup clean text](https://www.baeldung.com/wp-content/uploads/2021/02/Selection_202.png)
![jsoup clean text](https://ayobamiadewole.com/Blog/Images/android2.gif)
The sample code is as follows: import org.
How to use the Whitelist filter to clear HTML tags

Create an appropriate Whitelist object and pass it to the clean method to clean up HTML tags. Jsoup provides 5 default whitelist filters, and you can extend and customize any of them as a starting point. The two most permissive ones are described here.

relaxed(): retains only the a, b, blockquote, br, caption, cite, code, col, colgroup, dd, div, dl, dt, em, h1, h2, h3, h4, h5, h6, i, img, li, ol, p, pre, q, small, span, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, u, and ul tags. All other HTML tags are removed, and a rel=nofollow attribute is not forced onto hyperlinks.

basicWithImages(): allows every tag that basic() allows and additionally allows images (the img tag) with the appropriate img attributes. The src of an image may use http or https.
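The two permissive filters and the customization step can be tried with a minimal sketch like the one below. It assumes a recent jsoup release on the classpath, where the class this article calls Whitelist is named Safelist with the same factory methods; on an older release, substitute Whitelist. The class name PermissiveCleanDemo, its helper methods, and the sample HTML are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

public class PermissiveCleanDemo {

    // relaxed(): headings, tables and images survive; scripts and event handlers do not.
    static String cleanRelaxed(String html) {
        return Jsoup.clean(html, Safelist.relaxed());
    }

    // basicWithImages(): the basic() tags plus img with an http or https src.
    static String cleanBasicWithImages(String html) {
        return Jsoup.clean(html, Safelist.basicWithImages());
    }

    // Customization (illustrative): start from basic() and additionally allow h1.
    static String cleanCustom(String html) {
        Safelist custom = Safelist.basic().addTags("h1");
        return Jsoup.clean(html, custom);
    }

    public static void main(String[] args) {
        // Hypothetical untrusted input with an event handler and a script element.
        String dirty = "<h1>Title</h1>"
                + "<p onclick=\"steal()\">Hi "
                + "<img src=\"http://example.com/a.png\" alt=\"pic\">"
                + "<script>alert(1)</script></p>";

        System.out.println("relaxed:         " + cleanRelaxed(dirty));
        System.out.println("basicWithImages: " + cleanBasicWithImages(dirty));
        System.out.println("custom:          " + cleanCustom(dirty));
    }
}
```

With all three filters the onclick attribute and the script element are dropped. They differ in whether the h1 and img tags are kept.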
![jsoup clean text](https://simplesolution.dev/images/java-convert-html-into-plain-text-using-jsoup.png)
basic(): retains the a, b, blockquote, br, cite, code, dd, dl, dt, em, i, li, ol, p, pre, q, small, span, strike, strong, sub, sup, u, and ul tags with their appropriate attributes. All other HTML tags are removed, and images (the img tag) are not allowed. Hyperlinks may use the http, https, ftp, and mailto protocols, and a rel=nofollow attribute is forcibly added to each hyperlink.

simpleText(): retains only the b, em, i, strong, and u tags. All other HTML tags are removed.

none(): removes all HTML tags and keeps only text nodes.

Together with the filters described earlier, these make up the 5 Whitelist filters that Jsoup provides by default. Each one is passed to Jsoup's clean method, which is part of the Jsoup API:

static String clean (String strHTML, Whitelist whitelist)

This method removes every HTML tag that is not in the whitelist you specify.

This article mainly introduces the use of Jsoup to eliminate untrusted HTML in order to prevent XSS attacks. As a personal summary, there are roughly three strategies for preventing XSS attacks:

- Use regular expressions to filter against a whitelist or blacklist
- Filter against a whitelist or blacklist through the DOM object
- Use a third-party library such as Jsoup or AntiXSS

Reposted from: Use Jsoup to eliminate untrusted HTML (to prevent XSS attacks)
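As a rough illustration of the clean method with the three stricter filters, the sketch below runs the same untrusted snippet through none(), simpleText(), and basic(). It assumes a recent jsoup release on the classpath, where Whitelist has been renamed Safelist with the same factory methods; on an older release, substitute Whitelist. The class name StrictCleanDemo and the sample HTML are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

public class StrictCleanDemo {

    // none(): every tag is removed, only text nodes remain.
    static String cleanNone(String html) {
        return Jsoup.clean(html, Safelist.none());
    }

    // simpleText(): only b, em, i, strong and u are kept.
    static String cleanSimpleText(String html) {
        return Jsoup.clean(html, Safelist.simpleText());
    }

    // basic(): text-level tags and links are kept; links get rel="nofollow".
    static String cleanBasic(String html) {
        return Jsoup.clean(html, Safelist.basic());
    }

    public static void main(String[] args) {
        // Hypothetical untrusted input: an inline event handler and a script element.
        String dirty = "<p>Read <b>this</b> "
                + "<a href=\"http://example.com/\" onclick=\"steal()\">link</a>"
                + "<script>alert(1)</script></p>";

        System.out.println("none:       " + cleanNone(dirty));
        System.out.println("simpleText: " + cleanSimpleText(dirty));
        System.out.println("basic:      " + cleanBasic(dirty));
    }
}
```

Choosing the strictest filter that still preserves the formatting you need keeps the attack surface small: none() suits plain-text fields, while basic() suits user comments that may contain links.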