How to Install and Uninstall python3-html-text Package on Kali Linux
Last updated: December 24, 2024
1. Install "python3-html-text" package
This guide shows how to install python3-html-text on Kali Linux. Refresh the package index, then install the package:
$ sudo apt update
$ sudo apt install python3-html-text
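After the install finishes, it can help to confirm that the package is registered and that the Python module loads. The sketch below is one way to check, not part of the original guide: `pkg_status` is a hypothetical helper name, and the HTML string passed to `html_text.extract_text` is only an illustrative input.

```shell
# Hypothetical helper: report whether dpkg knows the package as installed.
pkg_status() {
    if dpkg -s "$1" >/dev/null 2>&1; then
        echo "installed"
    else
        echo "not installed"
    fi
}

pkg_status python3-html-text

# Only try the library if the module can actually be imported.
if python3 -c 'import html_text' 2>/dev/null; then
    python3 -c 'import html_text; print(html_text.extract_text("<h1>Hello</h1><p>world!</p>"))'
fi
```

If the first command prints "not installed", rerun the install step and check the apt output for errors.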
2. Uninstall "python3-html-text" package
To uninstall python3-html-text on Kali Linux, remove the package, then clear obsolete cached package files and any dependencies that are no longer needed:
$ sudo apt remove python3-html-text
$ sudo apt autoclean && sudo apt autoremove
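Before removing anything, the removal can be previewed with apt-get's simulate flag (`-s`), which prints the planned actions without changing the system. This is a minimal sketch under the assumption that `apt-get` is on the PATH; `preview_remove` is a hypothetical helper name, not something the package provides.

```shell
# Hypothetical helper: show what a removal would do without performing it.
preview_remove() {
    if command -v apt-get >/dev/null 2>&1; then
        # -s (simulate) makes no changes and does not require root.
        apt-get -s remove "$1" 2>&1 || true
    else
        echo "apt-get not found; skipping simulation"
    fi
}

preview_remove python3-html-text
```

The same flag works with `autoremove`, which is useful for reviewing which dependencies would be dropped.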
3. Information about the python3-html-text package on Kali Linux
Package: python3-html-text
Source: html-text
Version: 0.5.2-2
Installed-Size: 38
Maintainer: Christian Marillat
Architecture: all
Depends: python3-lxml, python3:any
Size: 9224
SHA256: 79e953053b91dbf927f7a4f1c1a158cfd4876ff3dd1ee306f533b4b0a09125a5
SHA1: 18c765cb6e0529614a702a349cc9f78c0d61f96b
MD5sum: 39521816c7420bc6d103602865ecba75
Description: extract text from HTML.
How is html_text different from .xpath('//text()') from LXML or .get_text()
from Beautiful Soup ?
.
* Text extracted with html_text does not contain inline styles,
javascript, comments and other text that is not normally visible to
users;
* html_text normalizes whitespace, but in a way smarter than
.xpath('normalize-space()'), adding spaces around inline elements (which
are often used as block elements in html markup), and trying to avoid
adding extra spaces for punctuation;
* html-text can add newlines (e.g. after headers or paragraphs), so that
the output text looks more like how it is rendered in browsers.
Description-md5:
Homepage: https://github.com/TeamHG-Memex/html-text
Section: python
Priority: optional
Filename: pool/main/h/html-text/python3-html-text_0.5.2-2_all.deb
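The SHA256 field above can be used to check a manually downloaded copy of the .deb file. The following is a minimal sketch, assuming the file has been saved in the current directory under the name from the Filename field; `verify_sha256` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the file's SHA256 matches the expected digest.
verify_sha256() {
    # $1 = path to the file, $2 = expected hex digest
    actual=$(sha256sum "$1" | awk '{print $1}')
    [ "$actual" = "$2" ]
}

# Assumed local file name, taken from the Filename field.
deb="python3-html-text_0.5.2-2_all.deb"
expected="79e953053b91dbf927f7a4f1c1a158cfd4876ff3dd1ee306f533b4b0a09125a5"

if [ -f "$deb" ]; then
    if verify_sha256 "$deb" "$expected"; then
        echo "checksum OK"
    else
        echo "checksum MISMATCH"
    fi
else
    echo "$deb not found; nothing to verify"
fi
```

Packages installed through `sudo apt install` are verified by apt itself, so this check only matters for files fetched by hand.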