Cheerio HTML Parser

automation Developer Tools

v0.5.9

Last updated Dec 24, 2025

n8n node to parse HTML using Cheerio

43 Weekly Downloads

404 Monthly Downloads

View on NPM GitHub Repository

Included Nodes

Cheerio HTML Parser

Description

n8n-nodes-cheerio-html-parser

This is a custom n8n node that uses Cheerio to parse HTML content.

Features

Parse HTML using multiple CSS selectors
Convert selected output to array or string
Remove unwanted elements (scripts, styles, navigation, etc.) before parsing
Extract specific attributes from elements

Installation

1. Clone this repository
2. Install dependencies:

npm install

3. Build the node:

npm run build

4. Link to your n8n installation:

npm link

5. In your n8n installation directory, run:

npm link n8n-nodes-cheerio-html-parser

Usage

1. Add the “Cheerio HTML Parser” node to your workflow
2. Input the HTML content you want to parse
3. Add one or more selectors with:
– Name: A unique identifier for this selector result
– CSS Selector: The CSS selector to find elements (e.g., “div.content”, “p.title”, “#main”)
– Attribute (optional): Extract a specific attribute instead of text content
– Return Single Item: Choose whether to return the first match or all matches
4. Optionally specify elements to remove before parsing (e.g., “script, style, nav, footer”)
5. Connect the node to your workflow

Example

Input HTML:


  Title
  Some text

With selector: .content h1 the node will return:

{
  "results": {
    "title": "Title"
  },
  "totalElements": 1,
  "selectors": 1
}

Complete Example

Input HTML:


  Welcome to my blog
  
    First paragraph of content
    Second paragraph of content

Node Configuration:

{
  "selectors": [
    {
      "name": "title",
      "selector": "h1.title",
      "singleItem": true
    },
    {
      "name": "paragraphs",
      "selector": "div.content p",
      "singleItem": false
    }
  ]
}

Output:

{
  "results": {
    "title": "Welcome to my blog",
    "paragraphs": [
      "First paragraph of content",
      "Second paragraph of content"
    ]
  },
  "totalElements": 3,
  "selectors": 2
}

Advanced Example with Element Removal

Input HTML:


  
    
    
  
  
    Navigation Menu
    
      Article Title
      
        Main content here
      
    
    Footer content

Node Configuration:

{
  "removeElements": "script, style, nav, footer",
  "selectors": [
    {
      "name": "title",
      "selector": "h1.title",
      "singleItem": true
    },
    {
      "name": "content",
      "selector": "div.content p",
      "singleItem": true
    },
    {
      "name": "titleClass",
      "selector": "h1.title",
      "attribute": "class",
      "singleItem": true
    }
  ]
}

Output:

{
  "results": {
    "title": "Article Title",
    "content": "Main content here",
    "titleClass": "title"
  },
  "totalElements": 3,
  "selectors": 3
}

Note: The script, style, nav, and footer elements were removed before parsing, so they don’t interfere with the content extraction.

Node Structure

The node outputs an object with the following structure:

results: An object containing the extracted data, with keys matching the selector names
totalElements: The total number of elements found across all selectors
selectors: The number of selectors that were processed

Development

To run tests:

npm test

License

MIT

Included Nodes

Description

n8n-nodes-cheerio-html-parser

Features

Installation

Usage

Example

Title

Complete Example

Welcome to my blog

Advanced Example with Element Removal

Article Title

Node Structure

Development

License

More in automation

Global Constants

Zoho Zeptomail

Datastore

Apify

Cronlytic Trigger

SerpApi Official

Perplexity

Instagram