# LogBus Windows Version Guide
This section describes how to use the Windows version of the data transmission tool LogBus.
**Before you start the integration, read the data rules first.** Once you are familiar with TA's data format and data rules, follow this guide to integrate.
**Data uploaded by LogBus must follow TA's data format.**
# Download LogBus Windows version
**Latest version**: 1.3.0
**Updated**: 2021-10-19
Download address
# I. Introduction of LogBus
The LogBus tool is mainly used to import back-end log data into the TA background in real time. Its core working principle is similar to Flume: it monitors the file stream under the server log directory, and when any log file in that directory receives new data, it verifies the new data and sends it to the TA background in real time.
LogBus is recommended for the following categories of users:
- Users of the server-side SDK who upload data via LogBus
- Users with high requirements for data accuracy and dimensions that the client-side SDK alone cannot meet, or for whom integrating the client-side SDK is inconvenient
- Users who do not want to develop their own back-end data push process
- Users who need to transfer large quantities of historical data
# II. Data preparation Before Use
First, convert the data to be transferred into TA's data format through ETL, and write it to local files or send it to a Kafka cluster. If you use the server-side SDK's consumers to write local files or to write to Kafka, the data is already in the correct format and needs no conversion.
Determine the directory where the data files to upload are stored, or the Kafka address and topics, and configure LogBus accordingly. LogBus monitors file changes in the directory (detecting newly created files or tailing existing ones), or subscribes to the data in Kafka.
Do not rename log files in the monitored directory after they have been uploaded. LogBus treats a renamed file as a new file and may upload it again, causing duplicate data.
Because the LogBus data transfer component contains data buffers, the LogBus directory may take up slightly more disk space than expected. Make sure the disk on the node where LogBus is installed has enough space, and reserve at least 10G of storage for each project (that is, for each additional APP_ID).
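As a rough pre-flight check, the reservation guideline above can be sketched in Python. This is an illustrative helper, not part of LogBus: the function names are hypothetical, and reading "10G" as 10 GiB is an assumption.

```python
import shutil

GIB = 1024 ** 3
# From the guideline above; interpreting "10G" as 10 GiB is an assumption.
RESERVED_PER_APPID_GIB = 10


def required_bytes(appid_count: int) -> int:
    """Minimum free space to reserve for the given number of APP_IDs."""
    if appid_count < 1:
        raise ValueError("at least one APP_ID is expected")
    return appid_count * RESERVED_PER_APPID_GIB * GIB


def has_enough_space(install_dir: str, appid_count: int) -> bool:
    """Compare free space on the disk holding install_dir with the requirement."""
    free = shutil.disk_usage(install_dir).free
    return free >= required_bytes(appid_count)
```

For example, `has_enough_space("C:/logbus", 2)` would check for 20 GiB of free space, where `C:/logbus` stands for wherever LogBus was unzipped.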
# III. Installation and Upgrade of LogBus
# 3.1 Installing LogBus
Download the LogBus compressed package and decompress it.
Unzipped directory structure:
- bin: launcher folder
- conf: configuration file folder
- lib: function folder
# IV. Parameter Configuration of LogBus
Enter the unzipped conf directory, which contains the configuration file template logBus.conf.Template. The template lists all LogBus configuration parameters and can be renamed to logBus.conf when used for the first time. Open the logBus.conf file to configure the relevant parameters.
# 4.1 Project and data source configuration (must be configured)
- Project APP_ID
Each APP_ID may be configured only once.
##APPID is the token from the TGA website. Get the APPID of the project you are integrating from the project configuration page in the TA background and fill it in here. Multiple APPIDs are separated by commas (',').
APPID=APPID_1,APPID_2
- Monitor file configuration (please select one, must be configured)
# 4.1.1. When the data source is a local file
##Path and file name of the data files read by LogBus (file names support fuzzy matching); read permission is required
##Different APPIDs are separated by commas, and different directories of the same APPID are separated by spaces
##TAIL_FILE file names support wildcard matching
TAIL_FILE=C:/path1/dir*/log.*,C:/path3/txt.*
TAIL_FILE supports monitoring multiple files in multiple subdirectories under multiple paths.
For example, the corresponding parameters can be configured as:
APPID=APPID1,APPID2
TAIL_FILE=C:/root/log_dir1/dir_*/log.* C:/root/log_dir*/log*/log.*,C:/test_log/*
The specific rules are as follows:
- Multiple monitoring paths of the same APP_ID are separated by spaces.
- The monitoring paths of different APP_IDs are separated by **commas** (","); after splitting on commas, the path groups correspond to the APP_IDs in order.
- Directories in a monitoring path support wildcards.
- File names support wildcards.
- The path delimiter can be "/" or a doubled backslash ("\\"); **do not use a single backslash**. For example: C:/root/_log or C:\\root\\_log
Do not store log files that need to be monitored in the server root directory.
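The pairing rules above can be illustrated with a short Python sketch. It is only a reading aid under stated assumptions: the function names are hypothetical, the code is not LogBus's own parser, and it assumes paths contain no spaces or commas.

```python
import re


def map_tail_files(appid_value: str, tail_file_value: str) -> dict[str, list[str]]:
    """Pair each APP_ID with its monitoring paths.

    Commas separate the groups of different APP_IDs; spaces separate
    the paths that belong to the same APP_ID.
    """
    appids = [a.strip() for a in appid_value.split(",")]
    groups = [g.split() for g in tail_file_value.split(",")]
    if len(appids) != len(groups):
        raise ValueError("APPID and TAIL_FILE must contain the same number of groups")
    return dict(zip(appids, groups))


def has_single_backslash(path: str) -> bool:
    """True if the path contains a backslash that is not doubled."""
    return re.search(r"(?<!\\)\\(?!\\)", path) is not None
```

Applied to the example configuration above, `map_tail_files` assigns the first two paths to APPID1 and `C:/test_log/*` to APPID2.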
# 4.1.2. When the data source is Kafka
Parameter KAFKA_TOPICS: to monitor multiple topics, separate the topics with spaces; if there are multiple APP_IDs, separate the topics monitored by each APP_ID with half-width commas.
Parameter KAFKA_GROUPID: must be unique.
Parameter KAFKA_OFFSET_RESET: sets Kafka's kafka.consumer.auto.offset.reset parameter. The accepted values are earliest and latest, and the default is earliest.
Note: The Kafka version of the data source must be 0.10.1.0 or higher.
Single APP_ID example:
APPID=appid1
######kafka configure
#KAFKA_GROUPID=tga.group
#KAFKA_SERVERS=localhost:9092
#KAFKA_TOPICS=topic1 topic2
#KAFKA_OFFSET_RESET=earliest
Multiple APP_ID examples:
APPID=appid1,appid2
######kafka configure
#KAFKA_GROUPID=tga.group
#KAFKA_SERVERS=localhost:9092
#KAFKA_TOPICS=topic1 topic2,topic3 topic4
#KAFKA_OFFSET_RESET=earliest
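A small Python sketch can make the Kafka rules easier to check before editing the configuration file. The helper below is hypothetical and is not part of LogBus; it only verifies the two constraints stated in this section (the allowed KAFKA_OFFSET_RESET values and one topic group per APP_ID).

```python
ALLOWED_OFFSET_RESET = {"earliest", "latest"}


def check_kafka_settings(appid: str, topics: str,
                         offset_reset: str = "earliest") -> dict[str, list[str]]:
    """Validate the values and return the topics monitored by each APP_ID."""
    if offset_reset not in ALLOWED_OFFSET_RESET:
        raise ValueError("KAFKA_OFFSET_RESET must be 'earliest' or 'latest'")
    appids = [a.strip() for a in appid.split(",")]
    # Commas separate the topic groups of different APP_IDs,
    # spaces separate the topics inside one group.
    topic_groups = [group.split() for group in topics.split(",")]
    if len(appids) != len(topic_groups):
        raise ValueError("each APP_ID needs exactly one group of topics")
    return dict(zip(appids, topic_groups))
```

With the multiple APP_ID example above, appid1 is paired with topic1 and topic2, and appid2 with topic3 and topic4.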
# 4.2 Configuration of transmission parameters (must be configured)
##Transfer settings
##Transfer URL
##For HTTP transmission, use:
PUSH_URL=http://receiver.ta.thinkingdata.cn/logbus
##If you are using a private deployment, change the transfer URL to: http://Data Acquisition Address/logbus
##Maximum number of records per transmission
#BATCH=10000
##Maximum interval between transmissions (in seconds); data is sent at least once per interval
#INTERVAL_SECONDS=600
##Number of transfer threads (default: single thread). Multithreading is recommended when network conditions are poor, but it consumes more memory and CPU resources
#NUMTHREAD=1
##Compression format for transfer: gzip, snappy, none
#COMPRESS_FORMAT=none
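One plausible reading of how BATCH and INTERVAL_SECONDS interact is sketched below in Python. This is an assumption made for illustration only; the guide does not describe LogBus's actual sending logic, and the function name is hypothetical.

```python
def should_flush(buffered_records: int, seconds_since_last_send: float,
                 batch: int, interval_seconds: float) -> bool:
    """Decide whether buffered data should be sent now.

    Assumed behaviour: send when a full batch is waiting, or when the
    interval has elapsed and there is at least one record to send.
    """
    if buffered_records <= 0:
        return False  # nothing to send
    return buffered_records >= batch or seconds_since_last_send >= interval_seconds
```

Under this reading, a larger BATCH or INTERVAL_SECONDS means fewer, larger transmissions, while smaller values reduce the delay before data reaches the TA background.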
# 4.3 Converter Configuration (optional configuration)
##Converter type; currently supported: json csv regex splitter
#PARSE_TYPE=json
##Additional fixed attributes, in the format: name value,name1 value1
#LABELS=
##Attribute names and types, used when PARSE_TYPE is csv, regex or splitter; format: name type,name1 type1
##Supported types: float int string date list bool
#SCHEMA=
##Separator; must not be empty when PARSE_TYPE is csv or splitter
#SPLITTER=
##Separator for list-type values; when a list type exists, it defaults to a comma
#LIST_SPLITTER=,
##Regular expression; must not be empty when PARSE_TYPE is regex
#FORMAT_REGEX=
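To make the relationship between SCHEMA, SPLITTER and LIST_SPLITTER concrete, the following Python sketch shows the kind of conversion these parameters describe. It is not LogBus's implementation: the function names, the sample schema, the handling of bool values, and the decision to leave date values as text are all assumptions.

```python
def parse_schema(schema: str) -> list[tuple[str, str]]:
    """Turn 'name type,name1 type1' into [(name, type), ...]."""
    fields = []
    for item in schema.split(","):
        name, type_name = item.split()
        fields.append((name, type_name))
    return fields


def convert(value: str, type_name: str, list_splitter: str = ","):
    """Convert one raw text value according to its declared type."""
    if type_name == "int":
        return int(value)
    if type_name == "float":
        return float(value)
    if type_name == "bool":
        return value.strip().lower() == "true"  # assumed spelling of booleans
    if type_name == "list":
        return value.split(list_splitter)
    if type_name in ("string", "date"):
        return value  # the date format is not specified here, so it stays text
    raise ValueError(f"unsupported type: {type_name}")


def split_line(line: str, schema: str, splitter: str,
               list_splitter: str = ",") -> dict:
    """Split one log line and name and type its fields according to the schema."""
    fields = parse_schema(schema)
    parts = line.split(splitter)
    if len(parts) != len(fields):
        raise ValueError("line does not match the number of schema fields")
    return {name: convert(part, type_name, list_splitter)
            for (name, type_name), part in zip(fields, parts)}
```

For instance, with the hypothetical schema `user string,level int,tags list`, a splitter of `|` and a list splitter of `;`, the line `alice|3|a;b` becomes a record with a text field, an integer field and a two-element list.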
# 4.4 Monitoring File Deletion Configuration (optional configuration)
##Deletion of files in the monitored directory: uncomment the parameters below to enable it
##Unit for deletion: only day or hour is supported
#UNIT_REMOVE=hour
##Delete files older than this many units
#OFFSET_REMOVE=20
##How often (in minutes) to delete uploaded monitored files
#FREQUENCY_REMOVE=60
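The combination of UNIT_REMOVE and OFFSET_REMOVE amounts to a cutoff time, which the following Python sketch computes. The function name is hypothetical, and how LogBus determines a file's age (for example by modification time) is not stated in this guide, so only the cutoff arithmetic is shown.

```python
from datetime import datetime, timedelta


def removal_cutoff(now: datetime, unit: str, offset: int) -> datetime:
    """Return the point in time before which files qualify for deletion."""
    if unit == "hour":
        delta = timedelta(hours=offset)
    elif unit == "day":
        delta = timedelta(days=offset)
    else:
        raise ValueError("UNIT_REMOVE only accepts 'day' or 'hour'")
    return now - delta
```

With UNIT_REMOVE=hour and OFFSET_REMOVE=20, a deletion run at 12:00 would use 16:00 on the previous day as its cutoff.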
# 4.5 Sample Configuration File
##################################################################################
## Thinkingdata Data Analysis Platform Transfer Tool LogBus Profile
##Uncommented parameters are required; commented parameters are optional and can be
##configured to suit your own circumstances
##Environment requirements: Java 8+, see the TGA website for more detailed requirements
##http://doc.thinkinggame.cn/tdamanual/installation/logbus_installation.html
##################################################################################
##APPID token from TGA website
##Different APPIDs are separated by commas and cannot be configured repeatedly
APPID=from_tga1,from_tga2
#-----------------------------------source----------------------------------------
######file-source
##Path and file name of the data files read by LogBus (file names support fuzzy matching); read permission is required
##Different APPIDs are separated by commas, and different directories of the same APPID are separated by spaces
##TAIL_FILE file names support Java-standard regular expressions
TAIL_FILE=C:/path1/log.* C:/path2/txt.*,C:/path3/log.* C:/path4/log.* C:/path5/txt.*
######kafka-source
#KAFKA_GROUPID=tga.flume
#KAFKA_SERVERS=
#KAFKA_TOPICS=
#KAFKA_OFFSET_RESET=earliest
#------------------------------------sink-----------------------------------------
##Transfer settings
##Transfer URL
##If you are using a private deployment, change the transfer URL to: http://Data Acquisition Address/logbus
##PUSH_URL=http://receiver.ta.thinkingdata.cn/logbus
PUSH_URL=http://${Data Acquisition Address}/logbus
##Maximum number of records per transmission
#BATCH=10000
##Maximum interval between transmissions (in seconds); data is sent at least once per interval
#INTERVAL_SECONDS=60
######http compress
##Compression format for transfer: gzip, snappy, none
#COMPRESS_FORMAT=none
##Whether to add a UUID attribute to each record
#IS_ADD_UUID=true
#------------------------------------parse----------------------------------------
##Converter type; currently supported: json csv regex splitter
#PARSE_TYPE=json
##Additional fixed attributes, in the format: name value,name1 value1
#LABELS=
##Attribute names and types, used when PARSE_TYPE is csv, regex or splitter; format: name type,name1 type1
##Supported types: float int string date list bool
#SCHEMA=
##Separator; must not be empty when PARSE_TYPE is csv or splitter
#SPLITTER=
##Separator for list-type values; when a list type exists, it defaults to a comma
#LIST_SPLITTER=,
##Regular expression; must not be empty when PARSE_TYPE is regex
#FORMAT_REGEX=
#------------------------------------other----------------------------------------
##To enable file deletion, uncomment both parameters below; the deletion program then runs every hour and deletes files in the monitored directory
##Files older than OFFSET_REMOVE units (of UNIT_REMOVE) are deleted
##Delete files older than this many units
#OFFSET_REMOVE=
##Unit: only day or hour is accepted
#UNIT_REMOVE=
# V. Start LogBus
Please check the following before starting for the first time:
- Check the Java version
Enter the bin directory, which contains two scripts: check_java.bat and logbus.bat.
check_java checks whether the Java version meets the requirements. Run the script; if the Java version does not meet the requirements, a prompt such as "Java version is less than 1.8" or "Can't find java, please install jre first." is displayed.
In that case, update the JDK version, or see the next item to install a separate JDK for LogBus.
- Install LogBus's independent JDK
If the JDK version on the node where LogBus is deployed does not meet LogBus's requirements for environmental reasons, and it cannot be replaced with a version that does, you can use this feature.
The bin directory contains install_logbus_jdk.bat. Running this script adds a new java directory to the LogBus working directory, and LogBus uses the JDK environment in this directory by default.
- Complete the configuration of logBus.conf and run the environment check command
For the configuration of logBus.conf, refer to the Parameter Configuration of LogBus section.
After the configuration is complete, run the env command to check whether the configuration parameters are correct:
logbus.bat env
If red exception information is output, there is a problem with the configuration. Modify it and run the check again until no exception is reported.
Whenever you modify logBus.conf, you need to restart LogBus for the new configuration to take effect.
- Start LogBus
logbus.bat start
After startup completes, logkit.exe is opened. Do not close it; otherwise data may be uploaded repeatedly.
# VI. Detailed LogBus Command
# 6.1 Help Information
Running logbus.bat without arguments, or with --help or -h, displays help information that describes the LogBus commands:
usage: logbus <Command | Auxiliary Command> [Option]
Command:
start Start logBus.
restart Restart logBus.
stop Exit logBus safely.
reset Reset logBus to read records
stop_atOnce Force logBus exit.
Auxiliary Command:
env Operating environment verification.
server [-url <url>|-url <url> -appid <appid>] Test Receiver Network.
show_conf Display current logBus configuration information.
version Display version number.
update Update logbus to latest version.
Option:
-appid <appid> Project appid
-h,--help Display Help Document and Exit.
-path <path> Specify the absolute path to the test file
-url <url> Specify the URL address for the test
Instance:
logbus.bat start Start logBus.
logbus.bat stop Exit logBus safely.
logbus.bat restart Restart logBus.
logbus.bat server -url http://${Receiver Address}/logbus -appid ***** Test Receiver Network
# 6.2 Transport Channel Check server -url
After completing the format verification, you also need to check whether the data channel is open. You can use the server -url command to check it, and you can also pass the APP_ID you obtained from the TA platform. Note that an APP_ID is bound to a project; make sure the APP_ID you enter corresponds to your project.
logbus.bat server -url http://${Receiver Address}/logbus -appid ${appid}
# 6.3 Display Configuration Information show_conf
You can use the show_conf command to view the configuration information of LogBus, as shown in the following figure:
logbus.bat show_conf
# 6.4 Start Environment Check env
You can use env to check the startup environment. If an item in the output is followed by an asterisk, there is a problem with the configuration; modify it and check again until no asterisk appears.
logbus.bat env
# 6.5 Upgrade LogBus Version update
You can use update to update LogBus online. This command updates LogBus to the latest version.
logbus.bat update
# 6.6 Start start
After you have completed the format verification, data channel inspection and environment inspection, you can start LogBus to upload data. LogBus will automatically detect whether there is new data written to your file, and if there is new data, upload the data.
logbus.bat start
# 6.7 Stop stop
If you want to stop LogBus, use the stop command. It takes some time, but no data is lost.
logbus.bat stop
# 6.8 Stop stop_atOnce
If you want to stop LogBus immediately, use the stop_atOnce command, which may cause data loss.
logbus.bat stop_atOnce
# 6.9 restart
You can use the restart command to restart LogBus, for example to make a new configuration take effect after modifying the configuration parameters.
logbus.bat restart
# 6.10 Reset reset
The reset command resets LogBus. Use it with care: it clears the file transfer records, so LogBus uploads all data again. Using it without understanding the consequences may duplicate your data, so it is recommended to consult TA staff first.
logbus.bat reset
After using the reset command, you need to run start to restart the data transfer.
# 6.11 View Version Number version
If you want to know the version number of the LogBus you are using, use the version command. If your LogBus does not have this command, you are using an earlier version.
logbus.bat version
# ChangeLog
# Version 1.3.0 --- 2021/10/19
- Support non-TA format data upload
# Version 1.2.0 --- 2021/05/26
- Support Cygwin mode
# Version 1.1.0 --- 2020/08/28
- Support adding #UUID
- Support #event_id and #first_check_id
- Support multi-threaded sending
- Supports delimiter resolution and regular resolution
# Version 1.0.0 --- 2020/06/25
- LogBus Windows release