Running Tabular Cleaning Baselines
Raha
https://github.com/BigDaMa/raha
PClean
Change the current directory.
Activate the project and install dependencies.
Run the command.
Cocoon
Running the Cocoon tutorial
https://github.com/Cocoon-Data-Transformation/cocoon
Garf
https://github.com/PJinfeng/Garf-master
Garf requires an Oracle database.
HoloClean
Git: https://github.com/HoloClean/holoclean
Related Info
System Version:
0. Prerequisites
1. Download the code
2. Create and activate a conda environment
3. Install dependencies
4. Test
Questions:
Q1:
Error installing python-Levenshtein:
python-Levenshtein is a C extension, so building it needs a C compiler and the Python development headers.
- Ubuntu/Debian:
sudo apt-get install build-essential python3-dev
- CentOS/Fedora:
yum groupinstall 'Development Tools'
yum install python3-devel
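Before retrying the install, it can help to confirm that both build prerequisites are actually visible to the interpreter in use. The following is a minimal diagnostic sketch, not part of HoloClean or python-Levenshtein; the function name `check_build_prerequisites` and the list of compiler names are assumptions made for illustration.

```python
import os
import shutil
import sysconfig


def check_build_prerequisites():
    """Report whether a C compiler and the Python headers are visible."""
    # Look for a common C compiler on PATH (gcc, cc, or clang).
    compiler = next(
        (path for path in map(shutil.which, ("gcc", "cc", "clang")) if path),
        None,
    )
    # Python.h sits in the interpreter's include directory when the
    # development headers (python3-dev / python3-devel) are installed.
    include_dir = sysconfig.get_paths()["include"]
    header = os.path.join(include_dir, "Python.h")
    return {
        "compiler": compiler,
        "python_header": header,
        "header_found": os.path.isfile(header),
    }


if __name__ == "__main__":
    for key, value in check_build_prerequisites().items():
        print(f"{key}: {value}")
```

If `compiler` is `None` or `header_found` is `False`, the packages listed above are the likely missing piece.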
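Q2:
Error Message:
Cause:
The annotations future import (from __future__ import annotations) is only available from Python 3.7 onward, so a smart_open release that relies on it fails on older interpreters.
Fix: downgrade smart_open to a release that still supports the Python version in the conda environment.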
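To tell whether an environment is affected, one option is to check the interpreter version against the 3.7 cutoff. This is a small sketch under the assumption that the failure comes from the annotations future import; the helper name `supports_future_annotations` is hypothetical, and the smart_open version to pin is not stated here, so it is left out.

```python
import sys


def supports_future_annotations(version_info=sys.version_info):
    """Return True if `from __future__ import annotations` is accepted."""
    # The `annotations` future feature was introduced in Python 3.7 (PEP 563).
    return tuple(version_info[:2]) >= (3, 7)


if __name__ == "__main__":
    if supports_future_annotations():
        print("This interpreter accepts `from __future__ import annotations`.")
    else:
        print("Python < 3.7 detected: pin smart_open to an older release.")
```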
RTClean
https://github.com/delgaudl/RTClean
Download
1. Create a conda environment
2. Modify requirements.txt
If you are using a proxy, you may need to configure it as well.
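3. Modify Code
Error Message:
time.clock() was removed in Python 3.8, so code that still calls it fails on newer interpreters.
Replace all time.clock() calls with time.time().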
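The replacement can be done by hand or with a short script. Below is a minimal sketch of the latter; the names `replace_clock_calls` and `patch_tree` are made up for illustration, and the script assumes the calls are written as `time.clock()` with no arguments.

```python
import re
from pathlib import Path

# Matches calls such as `time.clock()` or `time.clock( )`.
CLOCK_CALL = re.compile(r"\btime\.clock\(\s*\)")


def replace_clock_calls(source):
    """Return (new_source, count) with time.clock() swapped for time.time()."""
    return CLOCK_CALL.subn("time.time()", source)


def patch_tree(root):
    """Rewrite every .py file under `root` in place; return the files changed."""
    changed = []
    for path in Path(root).rglob("*.py"):
        text = path.read_text(encoding="utf-8")
        new_text, count = replace_clock_calls(text)
        if count:
            path.write_text(new_text, encoding="utf-8")
            changed.append(str(path))
    return changed
```

Calling `patch_tree(".")` from the repository root would rewrite the files in place, so it is worth committing or backing up first. The Python documentation suggests `time.perf_counter()` or `time.process_time()` as the replacement for timing code; `time.time()` is used here to match the step above.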
4. Test HoloClean
Run examples/holoclean_repair_example.py
5. Install extra requirements
Source: https://www.hardyhu.cn/2025/02/24/Running-Tablular-Cleaning-Baselines/