MetidaStats

Metida descriptive statistics: provides tables of categorized descriptive statistics computed from tabular data.

This program comes with absolutely no warranty. No liability is accepted for any loss or risk to public health resulting from the use of this software.

Example

using MetidaStats, CSV, DataFrames;

ds = CSV.File(joinpath(dirname(pathof(MetidaStats)), "..", "test", "csv",  "ds.csv")) |> DataFrame

The first five rows of the DataFrame ds:

ds[1:5, :]
5×7 DataFrame
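 ----- --------- --------- --------- ---------- --------- ----------- -----------
   Row       col       row        s1   Variable      var1        var2        var3
         String1   String1   String1    String1   Float64    Float64?    Float64?
 ----- --------- --------- --------- ---------- --------- ----------- -----------
     1         a         c         e          g   65.0591    -41.3106    -41.3106
     2         a         c         e          g    68.825     missing     missing
     3         a         c         e          g   21.3784         NaN     16.2079
     4         a         c         e          g   52.0018   -0.193488   -0.193488
     5         a         c         e          g   68.6295     5.44529     missing
 ----- --------- --------- --------- ---------- --------- ----------- -----------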

Import:

di  = MetidaStats.dataimport(ds, vars = [:var1, :var2], sort = [:col, :row])
DataSet: observations
  Var: Variable; ID: Dict{Symbol, Any}(:Variable => :var1, :row => String1("d"), :col => String1("a")); Length: 44
  Var: Variable; ID: Dict{Symbol, Any}(:Variable => :var2, :row => String1("d"), :col => String1("a")); Length: 44
  Var: Variable; ID: Dict{Symbol, Any}(:Variable => :var1, :row => String1("c"), :col => String1("b")); Length: 20
  Var: Variable; ID: Dict{Symbol, Any}(:Variable => :var2, :row => String1("c"), :col => String1("b")); Length: 20
  Var: Variable; ID: Dict{Symbol, Any}(:Variable => :var1, :row => String1("c"), :col => String1("a")); Length: 24
  Var: Variable; ID: Dict{Symbol, Any}(:Variable => :var2, :row => String1("c"), :col => String1("a")); Length: 24
  Var: Variable; ID: Dict{Symbol, Any}(:Variable => :var1, :row => String1("d"), :col => String1("b")); Length: 83
  Var: Variable; ID: Dict{Symbol, Any}(:Variable => :var2, :row => String1("d"), :col => String1("b")); Length: 83
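Conceptually, the import step yields one series per variable and per combination of the `sort` columns, which is why the eight entries above pair `var1` and `var2` with each `(col, row)` combination. The following is a minimal sketch of that grouping idea in plain Julia on an invented five-row table. It is not the MetidaStats implementation, and the names `toy` and `groups` are hypothetical.

```julia
# Toy rows standing in for a table with two grouping columns and one variable.
# All values here are invented for illustration.
toy = [
    (col = "a", row = "c", var1 = 1.0),
    (col = "a", row = "d", var1 = 2.0),
    (col = "b", row = "c", var1 = 3.0),
    (col = "b", row = "c", var1 = 4.0),
    (col = "a", row = "d", var1 = 5.0),
]

# Collect var1 values under each (col, row) key, analogous to sort = [:col, :row].
groups = Dict{Tuple{String, String}, Vector{Float64}}()
for r in toy
    push!(get!(groups, (r.col, r.row), Float64[]), r.var1)
end

# Print one line per group with its key and length.
for key in sort(collect(keys(groups)))
    println(key, " => length ", length(groups[key]))
end
```

Each key plays the role of the `ID` dictionary shown in the output above, and each vector plays the role of the observations whose `Length` is reported.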

Statistics:

des = MetidaStats.descriptives(di; skipmissing = true, skipnonpositive = true, stats = MetidaStats.STATLIST)
 ---------- --------- --------- --------- --------- ---------- --------- -------
  Variable       row       col         n      posn       mean       var      b ⋯
    Symbol   String1   String1   Float64   Float64    Float64   Float64   Floa ⋯
 ---------- --------- --------- --------- --------- ---------- --------- -------
      var1         d         a      44.0      44.0    58.0626   726.402   709. ⋯
      var2         d         a      44.0      24.0    1.88004   838.634   819. ⋯
      var1         c         b      20.0      20.0    51.8411   640.079   608. ⋯
      var2         c         b      20.0       9.0   0.435363   758.656   720. ⋯
      var1         c         a      24.0      24.0    51.8434   941.195   901. ⋯
      var2         c         a      22.0      10.0   -3.24275   714.676   682. ⋯
      var1         d         b      83.0      83.0    47.2578   737.991     72 ⋯
      var2         d         b      83.0      39.0    -3.2516   830.511   820. ⋯
 ---------- --------- --------- --------- --------- ---------- --------- -------
                                                              25 columns omitted
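In the output above, `n` and `posn` agree for `var1` but differ for `var2`, which contains negative values. A plausible reading, given the `skipmissing` and `skipnonpositive` keywords, is that `n` counts the non-missing observations and `posn` counts the strictly positive ones used for log-based statistics. The sketch below illustrates that reading with the `Statistics` standard library on invented data. It is an assumption about the behaviour, not the package's code.

```julia
using Statistics

# Invented sample with one missing value and two non-positive values.
x = [2.0, missing, -1.5, 8.0, 0.0, 4.0]

vals = collect(skipmissing(x))   # drop missing observations
pos  = filter(>(0), vals)        # keep strictly positive values for log statistics

n       = length(vals)           # number of non-missing observations
posn    = length(pos)            # number of strictly positive observations
logmean = mean(log.(pos))        # mean of the logarithms of the positive values
geom    = exp(logmean)           # geometric mean of the positive values

println("n = ", n, ", posn = ", posn, ", geom ≈ ", round(geom, digits = 4))
```

Here the positive values are 2, 8 and 4, so the geometric mean is the cube root of 64.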

Convert the result to a DataFrame:

df = DataFrame(des)
8×32 DataFrame
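 ----- ---------- --------- --------- --------- --------- ---------- --------- --------- --------- --------- ---------- --------- --------- --------- --------- ---------- ---------- -------------- -------------- ---------- ---------- --------- --------- ---------- --------- --------- ----------- ------------ ---------- ---------- ---------- ----------
   Row   Variable       row       col         n      posn       mean       var      bvar   logmean      geom     logvar        sd        se        cv     geocv   lci_0.95   uci_0.95   lmeanci_0.95   umeanci_0.95     median        min       max     range         q1        q3       iqr        kurt         skew   harmmean        ses        sek        sum
           Symbol   String1   String1   Float64   Float64    Float64   Float64   Float64   Float64   Float64    Float64   Float64   Float64   Float64   Float64    Float64    Float64        Float64        Float64    Float64    Float64   Float64   Float64    Float64   Float64   Float64     Float64      Float64    Float64    Float64    Float64    Float64
 ----- ---------- --------- --------- --------- --------- ---------- --------- --------- --------- --------- ---------- --------- --------- --------- --------- ---------- ---------- -------------- -------------- ---------- ---------- --------- --------- ---------- --------- --------- ----------- ------------ ---------- ---------- ---------- ----------
     1       var1         d         a      44.0      44.0    58.0626   726.402   709.893    3.8803   48.4385   0.541359   26.9518   4.06314   46.4186   84.7549    3.70907    112.416        49.8685        66.2567    55.7519    3.43629   99.1849   95.7486    40.6652   82.3774   41.7122    -0.83033    -0.246545     32.177   0.357484   0.701676    2554.76
     2       var2         d         a      44.0      24.0    1.88004   838.634   819.574   2.75783   15.7656     1.4262   28.9592   4.36576   1540.35   177.845   -56.5217    60.2818       -6.92436        10.6844     2.4562   -49.8684   46.3191   96.1875    -25.086   28.2933   53.3793    -1.20617   -0.0349994    36.1985   0.357484   0.701676    82.7216
     3       var1         c         b      20.0      20.0    51.8411   640.079   608.075   3.81381   45.3228   0.313529   25.2998   5.65721   48.8026   60.6832   -1.11199    104.794        40.0004        63.6817    52.7755    18.2958   99.6525   81.3567    33.1435   67.2546   34.1111    -1.02057     0.210728    38.7909   0.512103   0.992384    1036.82
     4       var2         c         b      20.0       9.0   0.435363   758.656   720.723   3.11501   22.5336   0.383102   27.5437   6.15896   6326.61   68.3248   -57.2143     58.085       -12.4555        13.3262   -2.65636   -48.1143    42.838   90.9522   -18.9878   26.9292    45.917    -1.04576   -0.0518575   -22.2214   0.512103   0.992384    8.70725
     5       var1         c         a      24.0      24.0    51.8434   941.195   901.979   3.68274   39.7551   0.705449   30.6789   6.26231   59.1761    101.23   -11.6208    115.308        38.8888         64.798    64.3451    8.37484   94.7447   86.3698    19.9405   71.4649   51.5244    -1.45164    -0.165553    27.4161   0.472261   0.917777    1244.24
     6       var2         c         a      22.0      10.0   -3.24275   714.676   682.191    2.6311    13.889   0.757478   26.7334   5.69959   824.405   106.437    -58.838    52.3525       -15.0957        8.61019   -2.15984   -48.7115   48.8994    97.611   -16.5256    13.299   29.8246   -0.444083   -0.0339916   -4.37031   0.490962    0.95278   -71.3406
     7       var1         d         b      83.0      83.0    47.2578   737.991     729.1    3.5817   35.9346   0.853743    27.166   2.98185   57.4847   116.121   -6.78402      101.3        41.3259        53.1896    46.2615   0.482996   99.8001   99.3171    24.7421   66.9678   42.2257    -1.02983     0.131148    15.3943   0.264174   0.522613     3922.4
     8       var2         d         b      83.0      39.0    -3.2516   830.511   820.505   2.90466    18.259    0.79552   28.8186   3.16325   886.289   110.254    -60.581    54.0778       -9.54431        3.04111   -4.06996   -47.4386   48.0944    95.533   -30.2718    19.076   49.3478    -1.30068    0.0524108    228.731   0.264174   0.522613   -269.883
 ----- ---------- --------- --------- --------- --------- ---------- --------- --------- --------- --------- ---------- --------- --------- --------- --------- ---------- ---------- -------------- -------------- ---------- ---------- --------- --------- ---------- --------- --------- ----------- ------------ ---------- ---------- ---------- ----------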
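Several columns of this table can be cross-checked against each other. The sketch below recomputes a few of them from the first row (`var1`, row `d`, col `a`) using textbook relations. These relations are inferred from the printed numbers, not taken from the MetidaStats source, so treat them as assumptions. In particular, the use of the absolute mean in `cv` is a guess based on the rows with negative means. The variable names are hypothetical.

```julia
# Values copied from row 1 of the table above (var1, row = d, col = a).
n, avg, v, logv = 44.0, 58.0626, 726.402, 0.541359
lo, hi, q1, q3  = 3.43629, 99.1849, 40.6652, 82.3774

sd    = sqrt(v)                    # sample standard deviation
se    = sd / sqrt(n)               # standard error of the mean
cv    = sd / abs(avg) * 100        # coefficient of variation, percent (abs is assumed)
bvar  = v * (n - 1) / n            # variance with denominator n instead of n - 1
geocv = sqrt(exp(logv) - 1) * 100  # geometric coefficient of variation, percent
rng   = hi - lo                    # range
iqr   = q3 - q1                    # interquartile range

# Compare each recomputed value with the number printed in the table.
checks = [("sd", sd, 26.9518), ("se", se, 4.06314), ("cv", cv, 46.4186),
          ("bvar", bvar, 709.893), ("geocv", geocv, 84.7549),
          ("range", rng, 95.7486), ("iqr", iqr, 41.7122)]
for (name, got, shown) in checks
    println(rpad(name, 6), isapprox(got, shown; rtol = 1e-4))
end
```

The comparison uses a relative tolerance because the inputs are rounded to six significant digits in the printed table.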

Reference

Textbooks:

https://towardsdatascience.com/5-free-books-to-learn-statistics-for-data-science-768d27b8215

Statistics for Julia:

https://statisticswithjulia.org/