Data Manipulation in Clojure Compared to R and Python

3 days ago
  • Tags: #Clojure, #data-science, #comparison
  • Compares data manipulation in Clojure (tablecloth), R (dplyr), and Python (Pandas, Polars).
  • Reading data: tablecloth interprets 'NA' as missing by default; in R, readr's read_csv and, in Python, Pandas' read_csv also recognize 'NA' through their built-in lists of missing-value strings; Polars needs it passed explicitly via null_values.
  • Basic exploration: Commands for viewing rows and column names, selecting columns and rows, and sorting are compared across libraries.
  • Advanced filtering: Selecting columns except one, columns starting with a string, numeric columns, and filtering rows by range.
  • Reshaping data: Pivoting to longer format in tablecloth, dplyr, Pandas, and Polars.
  • Creating/renaming columns: Adding columns based on existing ones and renaming columns, with emphasis on immutability in Clojure.
  • Grouping and aggregating: Summarizing counts and finding minimum values by group, with different approaches in each library.
  • Conclusions: All libraries are suitable but differ in philosophy (functional, immutable data vs. in-place mutation), which affects readability and maintainability.
  • Versions: Lists of language and library versions used in the post.
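
The operations listed above can be sketched in Pandas, one of the four libraries compared. This is a minimal illustration rather than code from the post: the CSV contents and column names (species, bill_length, bill_depth, year) are hypothetical, and only the Pandas side is shown.

```python
import io

import pandas as pd

# Hypothetical data; the column names and values are made up for illustration.
CSV = """species,bill_length,bill_depth,year
A,39.1,18.7,2007
A,NA,17.4,2008
B,46.5,NA,2007
B,50.0,15.2,2008
"""

# Reading data: read_csv treats the string "NA" as missing by default.
df = pd.read_csv(io.StringIO(CSV))

# Basic exploration: first rows, column names, sorting.
head = df.head(2)
columns = list(df.columns)
by_length = df.sort_values("bill_length", ascending=False)

# Advanced filtering.
no_year = df.drop(columns="year")                          # all columns except one
bill_cols = df.loc[:, df.columns.str.startswith("bill")]   # columns starting with a string
numeric = df.select_dtypes(include="number")               # numeric columns only
in_range = df[df["bill_length"].between(40, 50)]           # rows within an inclusive range

# Reshaping: pivot the two measurement columns to longer format.
longer = df.melt(
    id_vars=["species", "year"],
    value_vars=["bill_length", "bill_depth"],
    var_name="measure",
    value_name="value",
)

# Creating and renaming columns: assign and rename return new frames
# and leave df itself unchanged.
with_ratio = df.assign(ratio=df["bill_length"] / df["bill_depth"])
renamed = df.rename(columns={"bill_length": "length"})

# Grouping and aggregating: row counts and minimum value per group.
counts = df.groupby("species").size()
min_length = df.groupby("species")["bill_length"].min()
```

Using assign and rename instead of in-place column assignment keeps the original frame untouched, which is the closest Pandas gets to the immutable style the summary attributes to Clojure.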