Parses Web Pages using Postlight Mercury

This is a wrapper for the Mercury Parser API. The Mercury Parser is a single API endpoint that takes a URL and returns the content of the page. With one API request, Mercury takes any web article and returns only the relevant content (headline, author, body text, relevant images and more), free from clutter. It is reliable, easy to use and free. See the webpage here: <https://mercury.postlight.com/>.



The goal of postlightmercury is to wrap the Postlight Mercury web parser API for R.

Installation

You can install postlightmercury from GitHub with:

devtools::install_github("56north/postlightmercury")

Example

This basic example parses a single article and stores the result in df:

## Basic example code

# First get an API key here: https://mercury.postlight.com/web-parser/

# Then run the code below, replacing the X's with your API key.
library(postlightmercury)

df <- web_parser(
  page_urls = "https://trackchanges.postlight.com/building-awesome-cms-f034344d8ed",
  api_key = "XXXXXXXXXXXXXXXXXXXXXXX"  # the key must be passed as a quoted string
)

News

postlightmercury 1.1

  • Added a NEWS.md file to track changes to the package.
  • Rewrote web_parser() to use the crul package, which allows asynchronous requests.
  • Added the helper function null_to_na() to replace NULLs with NAs when converting lists to data frames.
  • Added code coverage.
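
To illustrate why a helper like null_to_na() is useful, here is a minimal base-R sketch of the idea. The function name null_to_na_sketch and the field names in the example record are hypothetical and chosen only for illustration; the package's own implementation may differ.

```r
# Hypothetical stand-in for the package's null_to_na(); the real
# implementation may differ.
null_to_na_sketch <- function(x) {
  # Replace every NULL element of a list with NA and keep the rest as-is.
  lapply(x, function(value) if (is.null(value)) NA else value)
}

# A parsed record with a missing field represented as NULL
# (field names are made up for this example).
record <- list(title = "An article", author = NULL, word_count = 512L)

# NULL has length zero, so it cannot fill a data frame cell; NA can.
cleaned <- null_to_na_sketch(record)
df <- as.data.frame(cleaned, stringsAsFactors = FALSE)
```

After this step every field has length one, so the record converts to a one-row data frame in which the missing author appears as NA.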

postlightmercury 1.2

  • Added the remove_html() function to strip HTML from the content column.
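
As a rough picture of what stripping HTML from a content column involves, the sketch below uses a plain base-R regular expression. This is a swapped-in technique, not the package's method: postlightmercury imports rvest and xml2, so remove_html() most likely parses the markup properly. The name strip_tags_sketch and the sample strings are hypothetical.

```r
# Hypothetical, dependency-free sketch; not the package's remove_html().
strip_tags_sketch <- function(html) {
  text <- gsub("<[^>]+>", " ", html)       # replace each tag with a space
  text <- gsub("[[:space:]]+", " ", text)  # collapse runs of whitespace
  trimws(text)                             # drop leading/trailing spaces
}

# Sample values standing in for a content column.
content <- c(
  "<p>Hello <b>world</b></p>",
  "<div><a href=\"#\">Link</a> text</div>"
)

cleaned <- strip_tags_sketch(content)
```

A regular expression like this is crude: it does not decode entities such as &amp;amp; and it inserts a space wherever a tag sits inside a word, which is why a real parser is the better choice for production use.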


install.packages("postlightmercury")

1.2 by Mikkel Freltoft Krogsholm, 2 years ago


Browse source code at https://github.com/cran/postlightmercury


Authors: Mikkel Freltoft Krogsholm


Documentation: PDF Manual


Task views: Web Technologies and Services


MIT + file LICENSE license


Imports tibble, crul, purrr, jsonlite, rvest, xml2

Suggests testthat, covr

