Tools to Capture and Convert the Web

Capture HTML Tables from Websites with PHP

There are several ways to convert HTML tables into JSON, CSV, or Excel spreadsheets using GrabzIt's PHP API; some of the most useful techniques are detailed here. Before you start, remember that after calling the URLToTable, HTMLToTable, or FileToTable method, the Save or SaveTo method must be called to capture the table. To see quickly whether this service is right for you, try a live demo of capturing HTML tables from a URL.

Basic Options

The code sample below automatically converts the first HTML table found on a specified web page into a CSV document.

$grabzIt->URLToTable("https://www.tesla.com");
//Then call the Save or SaveTo method
$grabzIt->HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>");
//Then call the Save or SaveTo method
$grabzIt->FileToTable("tables.html");
//Then call the Save or SaveTo method

By default, the first table found is converted. To convert the second table on a web page instead, pass 2 to the setTableNumberToInclude method.

$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setTableNumberToInclude(2);

$grabzIt->URLToTable("https://www.tesla.com", $options);
//Then call the Save or SaveTo method
$grabzIt->SaveTo("result.csv");

// HTML string variant
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setTableNumberToInclude(2);

$grabzIt->HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", $options);
//Then call the Save or SaveTo method
$grabzIt->SaveTo("result.csv");

// Local file variant
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setTableNumberToInclude(2);

$grabzIt->FileToTable("tables.html", $options);
//Then call the Save or SaveTo method
$grabzIt->SaveTo("result.csv");

You can also use the setTargetElement method to ensure that only tables within the element with the specified id are converted.

$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setTargetElement("stocks_table");

$grabzIt->URLToTable("https://www.tesla.com", $options);
//Then call the Save or SaveTo method
$grabzIt->SaveTo("result.csv");

// HTML string variant
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setTargetElement("stocks_table");

$grabzIt->HTMLToTable("<html><body><table id='stocks_table'><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", $options);
//Then call the Save or SaveTo method
$grabzIt->SaveTo("result.csv");

// Local file variant
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setTargetElement("stocks_table");

$grabzIt->FileToTable("tables.html", $options);
//Then call the Save or SaveTo method
$grabzIt->SaveTo("result.csv");

Alternatively, you can capture all of the tables on a web page by passing true to the setIncludeAllTables method. However, this only works with the XLSX and JSON formats. This option puts each table in a new sheet within the generated spreadsheet workbook.

$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setFormat('xlsx');
$options->setIncludeAllTables(true);

$grabzIt->URLToTable("https://www.tesla.com", $options);
//Then call the Save or SaveTo method
$grabzIt->SaveTo("result.xlsx");

// HTML string variant
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setFormat('xlsx');
$options->setIncludeAllTables(true);

$grabzIt->HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", $options);
//Then call the Save or SaveTo method
$grabzIt->SaveTo("result.xlsx");

// Local file variant
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setFormat('xlsx');
$options->setIncludeAllTables(true);

$grabzIt->FileToTable("tables.html", $options);
//Then call the Save or SaveTo method
$grabzIt->SaveTo("result.xlsx");

Converting HTML Tables to JSON

Sometimes you need to read HTML tables programmatically. GrabzIt lets you do this in PHP by converting online HTML tables into JSON. To do so, specify json as the format. In the example below, an HTML table is converted synchronously, and the built-in PHP function json_decode is then used to parse the JSON string into an object that is easy to work with.

$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setFormat("json");
$options->setTableNumberToInclude(1);

$grabzIt->URLToTable("https://www.tesla.com", $options);

$json = $grabzIt->SaveTo();
if ($json != null)
{
    $tableObj = json_decode($json);
}

// HTML string variant
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setFormat("json");
$options->setTableNumberToInclude(1);

$grabzIt->HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", $options);

$json = $grabzIt->SaveTo();
if ($json != null)
{
    $tableObj = json_decode($json);
}

// Local file variant
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setFormat("json");
$options->setTableNumberToInclude(1);

$grabzIt->FileToTable("tables.html", $options);

$json = $grabzIt->SaveTo();
if ($json != null)
{
    $tableObj = json_decode($json);
}
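
Once the JSON string has been retrieved, it may help to decode it defensively before using it. The following is a minimal sketch using only built-in PHP functions. The helper name decodeTableJson and the sample payload (an array of row objects keyed by column header) are assumptions made for illustration; the structure GrabzIt actually returns may differ, so inspect a real response first.

```php
<?php
// Hypothetical helper: decode a table JSON string into PHP arrays,
// throwing an exception instead of silently returning null on bad input.
function decodeTableJson(string $json): array
{
    // Passing true makes json_decode return associative arrays.
    $rows = json_decode($json, true);
    if (json_last_error() !== JSON_ERROR_NONE || !is_array($rows)) {
        throw new InvalidArgumentException(
            "Could not decode table JSON: " . json_last_error_msg()
        );
    }
    return $rows;
}

// ASSUMPTION: sample payload invented for this sketch, not real GrabzIt output.
$sample = '[{"Name":"Tom","Age":"23"},{"Name":"Nicola","Age":"26"}]';

foreach (decodeTableJson($sample) as $row) {
    echo $row["Name"] . " is " . $row["Age"] . "\n";
}
```

Decoding to associative arrays rather than objects is a design choice here: column headers that contain spaces or punctuation remain easy to access as array keys.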

Custom Identifier

You can pass a custom identifier to the table methods as shown below; this value is then returned to your GrabzIt PHP handler. For instance, the custom identifier could be a database identifier, allowing an extracted table to be associated with a specific database record.

$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setCustomId(123456);

$grabzIt->URLToTable("https://www.tesla.com", $options);
//Then call the Save method
$grabzIt->Save("http://www.example.com/handler.php");

// HTML string variant
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setCustomId(123456);

$grabzIt->HTMLToTable("<html><body><h1>Hello World!</h1></body></html>", $options);
//Then call the Save method
$grabzIt->Save("http://www.example.com/handler.php");

// Local file variant
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setCustomId(123456);

$grabzIt->FileToTable("example.html", $options);
//Then call the Save method
$grabzIt->Save("http://www.example.com/handler.php");
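
On the receiving side, handler.php needs to read the custom identifier back from the callback request. The sketch below shows one way that parsing step might look. It assumes the callback delivers the capture id and the custom identifier as query-string parameters named id and customid; those names, and the helper readCallbackParams, are assumptions for illustration and should be checked against the GrabzIt handler documentation.

```php
<?php
// Hypothetical helper for handler.php: pull the values of interest out of
// the callback's query parameters (pass $_GET in a real handler).
// ASSUMPTION: the parameter names "id" and "customid" are not confirmed here.
function readCallbackParams(array $query): array
{
    return [
        // Identifier of the capture, needed to fetch the finished result.
        "id" => isset($query["id"]) ? (string) $query["id"] : null,
        // The value originally passed to setCustomId, e.g. a database key.
        "customId" => isset($query["customid"]) ? (string) $query["customid"] : null,
    ];
}

// Simulated callback parameters, used so the sketch runs on its own.
$params = readCallbackParams(["id" => "abc123", "customid" => "123456"]);
echo "Capture " . $params["id"] . " belongs to record " . $params["customId"] . "\n";
```

Keeping the parsing in a small function that takes an array, rather than reading $_GET directly, makes the handler logic easy to exercise without a live callback.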