
🧩 Go Script Demo


Required Files (Located in the Project Root Directory)

├── main.go
├── main
├── input_schema.json
├── README.md
├── go.mod
├── go.sum
└── GoSdk
      ├── sdk.go
      ├── sdk_pd.go
      └── sdk_grpc_pd.go
File Name          Description
main.go            Script source code
main               Script entry executable (the execution entry point); must be named main
input_schema.json  UI input form configuration
README.md          Project documentation
sdk.go             Core SDK functionality (in the GoSdk directory)
sdk_pd.go          Data processing enhancement module (in the GoSdk directory)
sdk_grpc_pd.go     Network communication module (in the GoSdk directory)

Go Scripts Must Be Built into an Executable Before Uploading

The commands below use Windows cmd syntax (set); the build target must be linux/amd64:

set CGO_ENABLED=0
set GOOS=linux
set GOARCH=amd64
go build -o main .
💡 It is recommended to compress the executable with UPX after building to significantly reduce its file size.
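On Linux or macOS, the same cross-compilation can be expressed with inline environment variables. A sketch of the equivalent Unix-shell invocation (run from the project root; UPX must be installed separately for the optional compression step):

```shell
# Unix-shell equivalent of the Windows `set` commands above.
# Cross-compiles a static linux/amd64 binary named `main`.
CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o main .

# Optional: compress the binary with UPX to reduce its size.
upx --best main
```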

⭐ Core SDK Files

📁 File Overview

The following three SDK files are required and must be placed in the GoSdk directory under the project root:
File Name        Main Function
sdk.go           Core SDK functionality
sdk_pd.go        Data processing enhancement module
sdk_grpc_pd.go   Network communication module
These files together form the script’s toolbox, providing all essential capabilities required for crawler execution and communication with the platform backend.

🔧 Core Feature Usage Guide

1. Environment Parameters – Retrieve Script Input Configuration

When the script starts, configuration parameters (such as target website URLs or search keywords) can be passed in externally. Use the following method to retrieve them:
// Retrieve all input parameters as a JSON string
ctx := context.Background()
inputJSON, err := cafesdk.Parameter.GetInputJSONString(ctx)
if err != nil {
    // handle the error, e.g. log it and stop
}

// Example return value:
// {"website": "example.com", "keyword": "technology news"}
Use case:
You can reuse the same script to crawl different websites or datasets simply by changing input parameters, without modifying the code.

2. Runtime Logs – Record Script Execution Process

During execution, logs of different levels can be recorded and displayed in the platform console for monitoring and debugging:
ctx := context.Background()

// Debug information (most detailed, for troubleshooting)
cafesdk.Log.Debug(ctx, "Connecting to target website...")

// Informational logs (normal workflow)
cafesdk.Log.Info(ctx, "Successfully retrieved 10 news items")

// Warning logs (non-critical issues)
cafesdk.Log.Warn(ctx, "Network latency is high; performance may be affected")

// Error logs (execution failures)
cafesdk.Log.Error(ctx, "Failed to access target website; check the network connection")
Log Levels:
  • debug: Detailed debugging information (recommended during development)
  • info: Normal execution flow
  • warn: Warnings that do not stop execution
  • error: Critical errors that require attention
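For local test runs without the platform SDK, the level ordering above can be sketched with a minimal stdlib-only logger. The formatLine/logAt helpers here are illustrative, not part of the SDK:

```go
package main

import "fmt"

// Level mirrors the four platform log levels listed above.
type Level int

const (
	LevelDebug Level = iota
	LevelInfo
	LevelWarn
	LevelError
)

var levelNames = [...]string{"debug", "info", "warn", "error"}

// formatLine renders a log line the way the examples above read in the
// platform console: a level tag followed by the message.
func formatLine(l Level, msg string) string {
	return fmt.Sprintf("[%s] %s", levelNames[l], msg)
}

// logAt prints only messages at or above minLevel, which is how level
// filtering typically works: debug < info < warn < error.
func logAt(minLevel, l Level, msg string) {
	if l >= minLevel {
		fmt.Println(formatLine(l, msg))
	}
}

func main() {
	minLevel := LevelInfo // suppress debug noise outside development
	logAt(minLevel, LevelDebug, "Connecting to target website...")
	logAt(minLevel, LevelInfo, "Successfully retrieved 10 news items")
}
```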

3. Result Submission – Sending Data Back to the Platform

After data collection, results must be returned to the platform in two steps.

Step 1: Define Table Headers (Required)

Before pushing data, define the table structure (similar to defining column headers in Excel):
headers := []*cafesdk.TableHeaderItem{
    {
        Label:  "Title",
        Key:    "title",
        Format: "text",
    },
    {
        Label:  "Content",
        Key:    "content",
        Format: "text",
    },
}

ctx := context.Background()
res, err := cafesdk.Result.SetTableHeader(ctx, headers)
if err != nil {
    cafesdk.Log.Error(ctx, fmt.Sprintf("Failed to set table header: %v", err))
    return
}
fmt.Printf("SetTableHeader Response: %+v\n", res)
Field Description:
  • Label: Column name displayed to users
  • Key: Unique identifier used in code
  • Format: Data type, supported values:
    • "text" – string
    • "integer" – integer
    • "boolean" – true/false
    • "array" – list
    • "object" – dictionary/object

Step 2: Push Data Row by Row

After defining headers, push the collected data one record at a time:
type result struct {
    Title   string `json:"title"`
    Content string `json:"content"`
}

resultData := []result{
    {Title: "Sample Title 1", Content: "Sample Content 1"},
    {Title: "Sample Title 2", Content: "Sample Content 2"},
}

ctx := context.Background()

for _, datum := range resultData {
    jsonBytes, err := json.Marshal(datum)
    if err != nil {
        cafesdk.Log.Error(ctx, fmt.Sprintf("Failed to marshal data: %v", err))
        continue
    }

    res, err := cafesdk.Result.PushData(ctx, string(jsonBytes))
    if err != nil {
        cafesdk.Log.Error(ctx, fmt.Sprintf("Failed to push data: %v", err))
        return
    }
    fmt.Printf("PushData Response: %+v\n", res)
}
Important Notes:
  1. Table headers must be set before any data is pushed
  2. Keys in pushed data must exactly match the keys defined in the headers
  3. Data must be pushed one record at a time
  4. Logging after each push is recommended for traceability
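Note 2 (key consistency) can be enforced locally before each push. A sketch assuming the header keys are tracked as a slice of strings — validateKeys is a hypothetical helper, not an SDK function:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// validateKeys checks that every key in a JSON record matches a key
// defined in the table headers. headerKeys would be built from the Key
// fields passed to SetTableHeader; this helper is a local sketch.
func validateKeys(headerKeys []string, record string) error {
	allowed := make(map[string]bool, len(headerKeys))
	for _, k := range headerKeys {
		allowed[k] = true
	}
	var m map[string]any
	if err := json.Unmarshal([]byte(record), &m); err != nil {
		return fmt.Errorf("record is not valid JSON: %w", err)
	}
	for k := range m {
		if !allowed[k] {
			return fmt.Errorf("key %q not defined in table headers", k)
		}
	}
	return nil
}

func main() {
	keys := []string{"title", "content"}
	fmt.Println(validateKeys(keys, `{"title":"a","content":"b"}`)) // <nil>
	fmt.Println(validateKeys(keys, `{"title":"a","body":"b"}`))
}
```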

⚠️ Common Issues & Notes

  1. File placement: Ensure the three SDK files are located in the GoSdk directory of the project
  2. Imports: Import the SDK package (referenced as cafesdk in the examples above) to access its functionality
  3. Key consistency: Data keys must exactly match the table header keys
  4. Error handling: Always check return values, especially when pushing data

⭐ Script Entry File (main.go)

💡 Example Code

// (Code unchanged, original example retained verbatim)

Automated Data Collection Script: Workflow & Principles

1. Script Overview

This script is an automated data-collection tool that works like a digital employee:
it opens target web pages (such as social media sites), extracts the required information, and organizes the data into structured tables.

2. How Does It Work?

The entire process can be simplified into four main stages:

Step 1: Receive Instructions (Input Parameters)

Before execution, you provide instructions such as:
  • Target page URL
  • Number of records to collect

Step 2: Stealth Preparation (Proxy Network Configuration)

To reliably access overseas or restricted websites, the script automatically configures a secure proxy channel.

Step 3: Automated Execution (Business Logic Processing)

This is the core stage where the script:
  • Visits target pages
  • Extracts titles, content, images, and other required data

Step 4: Result Reporting (Data Push & Table Generation)

After collection:
  • Raw data is converted into standardized formats
  • Results are saved to the system
  • Table headers (e.g., “URL”, “Content”) are automatically generated
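The four stages above can be sketched as a plain-Go skeleton. In a real script each stub would call the cafesdk APIs shown earlier; the function names and sample data here are illustrative only:

```go
package main

import "fmt"

// Step 1: receive instructions (input parameters).
func receiveInstructions() map[string]string {
	return map[string]string{"url": "example.com", "count": "10"}
}

// Step 2: proxy network configuration.
func configureProxy() {
	fmt.Println("proxy configured")
}

// Step 3: business logic — visit pages and extract data.
func collect(params map[string]string) []string {
	return []string{"record from " + params["url"]}
}

// Step 4: push results and generate the table.
func report(records []string) {
	for _, r := range records {
		fmt.Println("pushed:", r)
	}
}

func main() {
	params := receiveInstructions()
	configureProxy()
	report(collect(params))
}
```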