Interesting, but I would still prefer pandas for data cleansing/manipulation, simply because I'm not limited by SQL syntax and can always use df.apply() and/or any Python package for custom processing.
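To make the df.apply() point concrete, here is a minimal sketch of the kind of custom cleanup I mean. The data, column names, and the clean_phone/make_label helpers are all made up for illustration.

```python
import re

import pandas as pd

# Hypothetical messy data, invented for this example
df = pd.DataFrame({
    "name": ["  Alice ", "BOB", "carol  "],
    "phone": ["(555) 123-4567", "555.987.6543", "n/a"],
})


def clean_phone(raw):
    """Keep digits only; return None unless exactly 10 digits remain."""
    digits = re.sub(r"\D", "", raw)
    return digits if len(digits) == 10 else None


def make_label(row):
    """Combine two columns, falling back to a placeholder for missing phones."""
    phone = row["phone"] if pd.notna(row["phone"]) else "no phone"
    return f"{row['name']} <{phone}>"


# Built-in string methods cover the simple cases
df["name"] = df["name"].str.strip().str.title()

# Series.apply runs an arbitrary Python function on each value
df["phone"] = df["phone"].apply(clean_phone)

# DataFrame.apply with axis=1 passes each row, so the logic can span columns
df["label"] = df.apply(make_label, axis=1)

print(df)
```

Row-wise apply is slow on big frames, so I'd only reach for it when the built-in vectorized methods can't express the logic.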
pandas with the Apache Arrow (pyarrow) backend is also high performance and compatible with cloud-native data lakes, since it reads and writes columnar formats like Parquet directly.
Plus, compatibility with the sklearn package is a killer feature: with just a few lines you can bolt an ML model on top of your data.
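For example, something like this is what I mean by "a few lines". It's a toy sketch: the churn data and column names are invented, and LogisticRegression is just one arbitrary choice of estimator.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical training data, invented for this example
df = pd.DataFrame({
    "hours_active": [1.0, 2.0, 3.0, 8.0, 9.0, 10.0],
    "tickets_opened": [5, 4, 5, 1, 0, 1],
    "churned": [1, 1, 1, 0, 0, 0],
})

# scikit-learn estimators accept DataFrames and Series directly
X = df[["hours_active", "tickets_opened"]]
y = df["churned"]

model = LogisticRegression()
model.fit(X, y)

# Score new rows that use the same feature columns
new_users = pd.DataFrame({"hours_active": [1.5, 9.5], "tickets_opened": [5, 0]})
predictions = model.predict(new_users)
print(predictions)
```

A real model would need a train/test split and proper evaluation, but the hand-off from DataFrame to estimator really is that short.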