BeautifulSoup HTML Parser and Encoding

from bs4 import BeautifulSoup

soup = BeautifulSoup(content)

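Without a second argument, BeautifulSoup picks the best parser that is installed and emits a warning. You can choose the parser explicitly by passing its name.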

soup = BeautifulSoup(content, "html.parser")


To use lxml, install it first:

pip install lxml

soup = BeautifulSoup(content, "lxml")

NOTE: lxml sometimes fails to find HTML elements that html.parser does find; when that happens I fall back to html.parser.
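The fallback described in the note could be sketched roughly as follows. The helper name `find_all_with_fallback` and the sample HTML are made up for illustration; the sketch assumes that an empty `find_all` result is the signal to try the next parser, which may not match every case where lxml misbehaves.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def find_all_with_fallback(content, name, attrs=None, parsers=("lxml", "html.parser")):
    """Return (parser_name, matches) from the first parser that finds anything.

    Hypothetical helper: tries each parser in order and moves on when the
    parser is not installed or when it finds no matching elements.
    """
    for parser in parsers:
        try:
            soup = BeautifulSoup(content, parser)
        except FeatureNotFound:
            # This parser is not installed (for example lxml); try the next one.
            continue
        matches = soup.find_all(name, attrs=attrs or {})
        if matches:
            return parser, matches
    return None, []


# Sample markup, invented for this example.
html = '<div><p class="msg">hello</p></div>'
parser_used, matches = find_all_with_fallback(html, "p", {"class": "msg"})
print(parser_used, [m.get_text() for m in matches])
```

Parsing the document twice costs extra time, so this is only worth doing for pages where lxml is known to miss elements.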

You can also specify the encoding of the HTML content. In some uncommon cases I have to specify the encoding, otherwise Unicode characters are not output correctly.

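soup = BeautifulSoup(content, "html.parser", from_encoding="utf-8")

import requests

r = requests.get("")
# Only trust r.encoding when the server actually declares a charset;
# otherwise pass None and let BeautifulSoup detect the encoding itself.
encoding = r.encoding if "charset" in r.headers.get("content-type", "").lower() else None
soup = BeautifulSoup(r.content, from_encoding=encoding)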
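As a small illustration of why the encoding sometimes has to be given explicitly, here is a sketch using bytes in a legacy encoding. The sample text and the choice of cp1251 are assumptions made for the example; note that `from_encoding` only applies when the markup is passed as bytes, not as an already decoded string.

```python
from bs4 import BeautifulSoup

# Sample bytes, invented for this example: Cyrillic text encoded as
# cp1251, with no <meta charset> declaration for the parser to rely on.
raw = "<p>Привет</p>".encode("cp1251")

# Telling BeautifulSoup the encoding up front avoids a wrong guess.
soup = BeautifulSoup(raw, "html.parser", from_encoding="cp1251")
text = soup.p.get_text()
print(text)
```

Without `from_encoding`, the result depends on what BeautifulSoup's detection guesses for these bytes, which can vary with the optional detection libraries that happen to be installed.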


