It is my belief that badly broken HTML will cause real problems.
What a browser does and how a Google bot reads a page are two very different things. Google uses Python heavily, so we'll assume they use Python for their bot as well. Python ships with an SGML parser that takes your page, dissects it into a tree structure, and then goes to work on that.
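Here's roughly the kind of dissection I mean. This is just Python's standard-library parser (html.parser in current versions) dumping the events it sees -- not Google's actual code, just an illustration of what a bot has to work from:
Code:
from html.parser import HTMLParser

# Dump the structure the parser sees: start tags, end tags, and text,
# in document order. A real bot would build its index (or a tree) out
# of events like these.
class EventDumper(HTMLParser):
    def handle_starttag(self, tag, attrs):
        print("START", tag, dict(attrs))

    def handle_endtag(self, tag):
        print("END", tag)

    def handle_data(self, data):
        if data.strip():
            print("TEXT", data.strip())

EventDumper().feed('<h1>Welcome</h1><a href="page.html">hi there</a>')
# Prints the h1, its text, the a tag with its href, and the anchor text,
# each as a separate, cleanly delimited event.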
Things like
Code:
<a href="page.html>hi there</a>
will sometimes be rendered properly in browsers anyway. Hanging table cells, improperly nested cells, and the like all go toward giving an automated process problems. It used to be that you didn't need to close <td>, <tr>, <b>, and so on; however, those all trip up automatic parsing.
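Feed that snippet with the missing closing quote to the same standard-library parser and watch what it does with it -- again, not Google's code, just a quick way to see how little it takes to throw an automated reader:
Code:
from html.parser import HTMLParser

class EventDumper(HTMLParser):
    def handle_starttag(self, tag, attrs):
        print("START", tag, dict(attrs))

    def handle_endtag(self, tag):
        print("END", tag)

    def handle_data(self, data):
        print("TEXT", repr(data))

p = EventDumper()
p.feed('<a href="page.html>hi there</a>')   # note the missing closing quote
p.close()
# The parser may never report the <a ...> as a start tag at all; it can give
# up on the malformed tag and hand it back as plain text, in which case the
# link and its anchor text are effectively lost to whatever reads the events.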
While I don't follow every recommendation the validators give, I do make sure the HTML isn't broken. I have pages that use non-standard attributes -- bgimage, bgcolor, and so on -- on elements where the standard doesn't allow them, but I'll let that slide. That won't break a parser.
However, improperly nested content can sometimes cause problems.
Code:
<a href="page.html"><h1>hi there</a></h1>
An automated process will get confused by the above. Depending on how it parses, I would suspect you might lose the effect of the <h1>. Now, Google probably goes to great lengths to make sure it can spider the web to the best of its ability, but why gamble on that?
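Here's a sketch of where the confusion comes from. Anything that builds a tree has to track open elements with something like a stack, and the mis-nested end tags force it to guess -- once more, stdlib html.parser as a stand-in, not anybody's real spider:
Code:
from html.parser import HTMLParser

# Track open elements with a stack, the way any tree builder more or less
# has to, and complain when an end tag doesn't match the innermost element.
class NestingChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            # One tool might close the <h1> early here, another might drop
            # the stray </a>; either way, which text belongs to the heading
            # and which to the link is now a matter of interpretation.
            print("mismatched </%s>, open elements: %s" % (tag, self.stack))

NestingChecker().feed('<a href="page.html"><h1>hi there</a></h1>')
# prints: mismatched </a>, open elements: ['a', 'h1']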