🧩 Node.js Script Demo
- GitHub Repository: https://github.com/CafeScraper/NodeScirptDemo
Required Files (Located in the Project Root Directory)

| File Name | Description |
|---|---|
| main.js | Script entry file (execution entry point); must be named main |
| package.json | Node.js dependency management file |
| input_schema.json | UI input form configuration |
| README.md | Project documentation |
| sdk.js | SDK core functionality |
| sdk_pd.js | Data processing enhancement module |
| sdk_pd_grpc.js | Network communication module |
⭐ Core SDK Files
📁 File Overview
The following three SDK files are required and must be placed in the root directory of the script:

| File Name | Primary Function |
|---|---|
| sdk.js | Core functionality module |
| sdk_pd.js | Data processing enhancement module |
| sdk_pd_grpc.js | Network communication module |
🔧 Core Feature Usage Guide
1. Environment Parameters – Retrieve Script Startup Configuration
When the script starts, configuration parameters (such as target website URLs or search keywords) can be passed in externally and retrieved inside the script. If you need to crawl different websites for different tasks, you can simply pass different parameters without modifying the script code.
2. Runtime Logs – Track Script Execution
During execution, you can record logs at different levels. These logs will be displayed in the platform console, making it easy to monitor execution status and troubleshoot issues:
- debug: Detailed debugging information, recommended during development
- info: Normal execution flow logs
- warn: Warnings indicating potential issues that do not stop execution
- error: Errors that require attention
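A minimal sketch of how such a leveled logger behaves. The real SDK exposes its own logging functions, whose names are not documented here; `makeLogger` below is a local stand-in that only illustrates the filtering semantics of the four levels:

```javascript
// Hypothetical stand-in for the SDK's logging API. Levels are ordered from
// least to most severe; a logger at a given threshold drops anything below it.
const LEVELS = ["debug", "info", "warn", "error"];

function makeLogger(minLevel = "debug") {
  const threshold = LEVELS.indexOf(minLevel);
  const log = (level, message) => {
    if (LEVELS.indexOf(level) < threshold) return null; // filtered out
    const line = `[${level.toUpperCase()}] ${message}`;
    console.log(line); // the platform console would display this line
    return line;
  };
  return {
    debug: (m) => log("debug", m),
    info: (m) => log("info", m),
    warn: (m) => log("warn", m),
    error: (m) => log("error", m),
  };
}

const logger = makeLogger("info");
logger.debug("fetching page 1"); // below the "info" threshold, not shown
logger.info("collected 10 records");
```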
3. Result Submission – Sending Collected Data Back to the Platform
After data collection, results must be returned to the platform in two steps.

Step 1: Define Table Headers (Required)
Before pushing any data, you need to define the table structure, similar to defining column headers in Excel:
- label: Column name displayed to users (user-visible)
- key: Unique identifier used in code
- format: Data type; supported values:
  - "text" – string
  - "integer" – integer
  - "boolean" – boolean
  - "array" – list/array
  - "object" – object/dictionary
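The header-definition call itself is SDK-specific and not shown in this document; the sketch below uses a hypothetical `defineHeaders` helper only to show the shape of a column definition (label/key/format) and a validation pass over the supported formats:

```javascript
// Hypothetical helper: the actual SDK function that registers headers may be
// named differently. The label/key/format fields follow the rules above.
const SUPPORTED_FORMATS = new Set(["text", "integer", "boolean", "array", "object"]);

function defineHeaders(columns) {
  for (const col of columns) {
    if (!col.label || !col.key) {
      throw new Error(`Column needs both label and key: ${JSON.stringify(col)}`);
    }
    if (!SUPPORTED_FORMATS.has(col.format)) {
      throw new Error(`Unsupported format "${col.format}" for key "${col.key}"`);
    }
  }
  return columns.map((c) => c.key); // keys that pushed rows must later match
}

const keys = defineHeaders([
  { label: "URL", key: "url", format: "text" },
  { label: "Likes", key: "likes", format: "integer" },
  { label: "Tags", key: "tags", format: "array" },
]);
```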
Step 2: Push Data Row by Row
After setting the table headers, push collected data one record at a time:
- The order of setting table headers and pushing data does not matter
- Keys in the pushed data must exactly match the keys defined in the table headers
- Data must be pushed one record at a time
- It is recommended to log after each push for easier tracking
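The rules above can be sketched as follows. `makePusher` and `pushRow` are hypothetical stand-ins for the SDK's submission call, and the in-memory array stands in for the platform's result store; the sketch shows row-by-row pushing with the key-consistency rule enforced:

```javascript
// Hypothetical stand-in for the SDK's data-push API.
function makePusher(headerKeys) {
  const expected = new Set(headerKeys);
  const stored = []; // stands in for the platform's result store
  return {
    pushRow(row) {
      // Keys are case-sensitive and must match the defined headers exactly.
      const unknown = Object.keys(row).filter((k) => !expected.has(k));
      if (unknown.length > 0) {
        throw new Error(`Unknown keys: ${unknown.join(", ")}`);
      }
      stored.push(row); // one record at a time
      console.log(`pushed record ${stored.length}`); // log after each push
      return stored.length;
    },
    count: () => stored.length,
  };
}

const pusher = makePusher(["url", "likes"]);
pusher.pushRow({ url: "https://example.com/post/1", likes: 42 });
```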
⚠️ Common Issues & Notes
- File location: Ensure all three SDK files are placed in the script’s root directory
- Imports: You can directly use SDK or CafeSDK in your code
- Key consistency: Data keys must exactly match table header keys (case-sensitive)
- Error handling: Always check SDK call results, especially when pushing data
⭐ Script Entry File (main.js)
💡 Example Code
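The repository's actual main.js is not reproduced in this document, so the skeleton below is a hedged reconstruction of the four-stage workflow described in the next section. Every `sdk.*` call is a local stand-in (the real functions live in sdk.js, sdk_pd.js, and sdk_pd_grpc.js and may have different names), and `fetchPage` fakes the collection step:

```javascript
// main.js - hedged skeleton. All sdk.* functions below are local stand-ins,
// NOT the real SDK API.
const sdk = {
  getInput: () => ({ targetUrl: "https://example.com/feed", maxRecords: 2 }),
  log: (level, msg) => console.log(`[${level}] ${msg}`),
  defineHeaders: (cols) => cols.map((c) => c.key),
  pushRow: (row) => row, // stand-in: the platform would persist the row
};

async function fetchPage(url) {
  // Stand-in for real page fetching and extraction (Step 3).
  return [
    { url: `${url}#1`, content: "hello" },
    { url: `${url}#2`, content: "world" },
  ];
}

async function main() {
  // Step 1: receive instructions (input parameters)
  const { targetUrl, maxRecords } = sdk.getInput();
  sdk.log("info", `target: ${targetUrl}`);

  // Step 2: proxy configuration would happen here (omitted; SDK-specific)

  // Define table headers before pushing any data
  sdk.defineHeaders([
    { label: "URL", key: "url", format: "text" },
    { label: "Content", key: "content", format: "text" },
  ]);

  // Step 3: automated execution (business logic)
  const records = (await fetchPage(targetUrl)).slice(0, maxRecords);

  // Step 4: push results back, one record at a time
  for (const rec of records) {
    sdk.pushRow(rec);
    sdk.log("debug", `pushed ${rec.url}`);
  }
  return records.length;
}

main().then((n) => console.log(`done: ${n} records`));
```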
Automated Data Collection Script: Workflow & Principles
1. Script Overview
This is a script for an automation tool that acts like a digital worker. It automatically opens specified web pages (such as social media sites), extracts the required information, and organizes it into structured tables.
2. How Does It Work?
The entire process can be simplified into four main stages.

Step 1: Receive Instructions (Input Parameters)
Before execution, you provide instructions such as:
- Target page URL
- Number of records to collect
Step 2: Stealth Preparation (Proxy Network Configuration)
To reliably access overseas or restricted websites, the script automatically configures a secure proxy channel.

Step 3: Automated Execution (Business Logic Processing)
This is the core stage where the script:
- Visits target pages
- Extracts titles, content, images, and other required data
Step 4: Result Reporting (Data Push & Table Generation)
After collection:
- Raw data is converted into standardized formats
- Results are saved to the system
- Table headers (e.g., “URL”, “Content”) are automatically generated