Convert HTML file with table(s) to DataFrame #71

@s-celles

Hello,

I have an HTML file with a table and would like to convert it to a Julia DataFrame.

I was looking for a function similar to the Python pandas read_html function (which directly outputs a list of DataFrames).

Unfortunately, I don't see a similar function in the Julia ecosystem.

In the Gumbo docs, I was looking for an example that iterates over the rows and columns of each table.

Here is a basic HTML source file with 2 tables:

<!DOCTYPE html>
<HTML>
  <head></head>
  <body>

    <h1>First table</h1>
    <table>
      <tbody>
        <tr>
          <th>
            A
          </th>
          <th>
            B
          </th>
        </tr>
        <tr>
          <td>
            1
          </td>
          <td>
            1.1
          </td>
        </tr>
        <tr>
          <td>
            2
          </td>
          <td>
            2.1
          </td>
        </tr>
      </tbody>
    </table>

    <h1>Second table</h1>
    <table>
      <tbody>
        <tr>
          <th>
            AA
          </th>
          <th>
            BB
          </th>
        </tr>
        <tr>
          <td>
            10
          </td>
          <td>
            10.1
          </td>
        </tr>
        <tr>
          <td>
            20
          </td>
          <td>
            20.1
          </td>
        </tr>
      </tbody>
    </table>

  </body>
</HTML>
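
For reference, the row-and-column traversal being asked for can be sketched as follows. This is Python rather than Julia (Python is where read_html lives), it uses only the standard-library html.parser module, and it is not Gumbo or Cascadia code. The names TableExtractor and read_tables are hypothetical, chosen for this illustration. The sketch assumes flat tables like the ones above (no nested tables, no rowspan/colspan) and returns cell text as strings without converting types.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> as a list of rows; each row is a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one entry per <table> seen so far
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names, so <HTML> or <TABLE> also match.
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a <th> or <td>.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            # Strip the indentation and newlines surrounding the cell value.
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def read_tables(html):
    """Return a list of tables; the first row of each table holds the <th> headers."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


sample = "<table><tr><th>A</th><th>B</th></tr><tr><td>1</td><td>1.1</td></tr></table>"
header, *rows = read_tables(sample)[0]
print(header, rows)  # ['A', 'B'] [['1', '1.1']]
```

Each returned table is a header row followed by data rows, which is the shape a DataFrame constructor would need. A Julia version would have to perform the same three-level walk (tables, then rows, then cells) over the parsed tree.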

I'm not sure whether such an example should be part of Gumbo, Cascadia, or even EzXML.jl.

Anyway, none of these projects shows an example with HTML tables, so there is probably room for doc improvement.

Kind regards

PS: related SO post: https://stackoverflow.com/questions/42915962/extracting-and-constructing-tables-from-html-files-using-julia