Convert HTML file with table(s) to DataFrame

Hello,

I have an HTML file with a table and would like to convert it to a Julia DataFrame.

I was looking for a function similar to the Python pandas [`read_html`](https://github.com/pandas-dev/pandas/blob/v0.23.4/pandas/io/html.py#L826-L987) function, which directly returns a list of DataFrames.

Unfortunately, I don't see a similar function in the Julia ecosystem.

In the Gumbo docs, I was looking for an example that iterates over the rows and columns of each table, but I could not find one.

Here is a basic HTML source file with 2 tables:
```html
<!DOCTYPE html>
<HTML>
  <head></head>
  <body>

    <h1>First table</h1>
    <table>
      <tbody>
        <tr>
          <th>
            A
          </th>
          <th>
            B
          </th>
        </tr>
        <tr>
          <td>
            1
          </td>
          <td>
            1.1
          </td>
        </tr>
        <tr>
          <td>
            2
          </td>
          <td>
            2.1
          </td>
        </tr>
      </tbody>
    </table>

    <h1>Second table</h1>
    <table>
      <tbody>
        <tr>
          <th>
            AA
          </th>
          <th>
            BB
          </th>
        </tr>
        <tr>
          <td>
            10
          </td>
          <td>
            10.1
          </td>
        </tr>
        <tr>
          <td>
            20
          </td>
          <td>
            20.1
          </td>
        </tr>
      </tbody>
    </table>

  </body>
</HTML>
```
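
To make the requested behaviour concrete, here is a minimal sketch in Python of the traversal a `read_html`-like function performs on the file above: find each `table`, walk its `tr` rows, and collect the text of each `th`/`td` cell. It uses only the standard library `html.parser` module and is not how pandas implements `read_html`. The names `TableExtractor` and `read_tables` are hypothetical, the first row of each table is assumed to be the header, nested tables are not handled, and cell values are left as strings with no type inference.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> as a list of rows; each row is a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table> seen so far
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text chunks of the cell being read, or None outside <th>/<td>

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <HTML> or <TABLE> also match.
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            # Strip the indentation and newlines surrounding the cell text.
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def read_tables(html):
    """Return one {column name: list of string values} dict per table (hypothetical helper)."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    result = []
    for rows in parser.tables:
        if not rows:
            continue
        header, *body = rows  # assumption: the first row holds the column names
        result.append({
            name: [row[i] if i < len(row) else None for row in body]
            for i, name in enumerate(header)
        })
    return result
```

A Julia version built on Gumbo or Cascadia would presumably need the same three nested steps (tables, then rows, then cells) before handing the columns to a DataFrame constructor.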

I'm not sure whether such an example should be part of [Gumbo](https://github.com/JuliaWeb/Gumbo.jl/), [Cascadia](https://github.com/Algocircle/Cascadia.jl), or even [EzXML.jl](https://github.com/bicycle1885/EzXML.jl).

Anyway, none of these projects shows an example with HTML tables, so there is probably room for doc improvement.


Kind regards

PS: related SO post: https://stackoverflow.com/questions/42915962/extracting-and-constructing-tables-from-html-files-using-julia

Convert HTML file with table(s) to DataFrame #71