-->

SOFA: open source, user-friendly statistical software designed for statistics, data analysis, and reporting

SOFA is open source, user-friendly statistical software designed for statistics, data analysis, and reporting. It has become very popular because it is easy to install on different platforms.


Importing and Compatibility.

One of its most useful features, speaking as a developer, is that it is database friendly: it can import data from many database engines, such as MySQL, MS Access, SQLite, PostgreSQL, and MS SQL Server, and support for Oracle is planned. Importing is not limited to database engines; SOFA can also import from office documents, either MS Excel or OpenOffice Calc.



Output: SOFA can write its output to a simple HTML file, so it can be used on the internet or a website, or even in a spreadsheet file. It produces colorful output with many types of graphs, such as bar charts, pie charts, and single or multiple line charts.


Available Tests :


* Row and column percentages, with the ability to nest variables, e.g. look at Ethnicity and Gender vs Age
* Mean
* Median
* Standard Deviation
* Sum
* N items
* Min
* Max
* Range
* Pearson’s Chi-Square with Contingency Tables
* Independent samples t-test
* Paired samples t-test
* One-way ANOVA
* Mann Whitney U
* Wilcoxon Signed Ranks
* Kruskal Wallis H
* Pearson’s Correlation
* Spearman’s Correlation
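
As a rough illustration of what the summary measures in this list mean, here is a minimal Python sketch that computes them with the standard library only. It is not SOFA's own code; the `describe` helper and the sample ages are made up for this example, and it assumes the sample (n - 1) form of the standard deviation, which may or may not match what SOFA reports.

```python
import statistics


def describe(values):
    """Return the summary measures listed above for a list of numbers."""
    return {
        "n": len(values),
        "sum": sum(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Assumption: sample standard deviation (divides by n - 1).
        "std_dev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }


# Made-up ages, for illustration only.
ages = [23, 35, 41, 29, 52, 35]
summary = describe(ages)
for name, value in summary.items():
    print(name, value)
```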


Download :


Linux: a Debian/Ubuntu package is listed, and for non-Debian-based distros a Linux package is available as a *.tar.gz archive. Windows and Mac OS X binaries are also available in the download section.


Links :


Deb packages are supplied for download on the main SOFA website. To cater to other flavours of Linux, a tar.gz is also provided. Inside, you will find README.txt and INSTALL.sh.
  • Step 1 is to use your distro package manager to install all the required support packages e.g. matplotlib (for chart plotting). Details of required packages are in the next subsection.
  • Step 2 is to run INSTALL.sh as described in README.txt.
The process is quite simple and has been achieved in two very different distros: SOFA works on Fedora 14 and openSUSE 11.3.
This page is the go-to place for information on how to successfully install SOFA on non-Ubuntu Linux systems. For direct discussion, please post at the SOFA Statistics Google discussion group.
And if you manage to get SOFA working on other distros please email me (grant@sofastatistics.com) the relevant package details etc and a screen-shot (preferably one which reveals the distro involved).

Packages Required (Dependencies)

In Ubuntu SOFA requires:
  • python (>= 2.6.2),
  • wx-common (>= 2.8.9.2),
  • python-wxversion (>= 2.8.9.2),
  • python-wxgtk2.8 (>= 2.8.9.2),
  • python-numpy (>= 1:1.2.1),
  • python-pysqlite2 (>= 1.0.1),
  • python-mysqldb (>= 1.2.2),
  • python-pygresql (>= 1:4.0),
  • python-matplotlib
In Fedora 14 I installed the following successfully:
  • Python was already there
  • wxPython-2.8.11… and that brought with it some other packages needed.
  • numpy-1:1.4.1…
  • python-sqlite2-1:2.3.5…
  • MySQL-python-1.2.3…
  • PyGreSQL-3.8.1…
  • python-matplotlib-1.0.0…
  • for more recent versions of Fedora you will need to separately install python-matplotlib-wx (otherwise you get a message about “No module named backend_wxagg”)
In openSUSE 11.3 I installed the following successfully AFTER I had added the community devel:languages:python and education repositories:
  • python-wxGTK 2.8.10.1…
  • python-numpy (NB to upgrade the existing version 1.3… to the later education repo version 1.5… - see Python matplotlib on openSUSE)
  • python-mysql 1.2.2-90.1
  • PyGreSQL 3.8.1…
  • python-matplotlib 1.0.0…
  • python-sqlite2 2.6.0…
  • python-webkit (upgraded)
  • python-webkitgtk 1.1.8… (to avoid error about backend_wxagg module being missing)
I expect in other major distros there is a similar process of finding packages that seem right, trying, and adding more if necessary. It certainly should be possible to get SOFA working on the major distros.
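
When trying packages on another distro, it can help to check which Python modules are still missing before launching SOFA. The following is a minimal sketch, not part of SOFA: the `missing_modules` helper is hypothetical, and the import names in `ASSUMED_MODULES` are assumptions inferred from the package names above, so adjust them to whatever your distro actually installs. Run it with the same Python interpreter you will use to start SOFA.

```python
# Assumed import names for the packages listed above (not taken from SOFA itself).
ASSUMED_MODULES = ["wx", "numpy", "matplotlib", "MySQLdb", "pgdb"]


def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except Exception:
            # Usually ImportError; a broken install can fail in other ways too.
            missing.append(name)
    return missing


print("Missing modules:", missing_modules(ASSUMED_MODULES))
```

An empty list only means the modules import cleanly; it does not check the minimum versions given for Ubuntu above.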

Running SOFA

Make a launcher with the following details:
  • Name: SOFA Statistics
  • Description: Analysis package
  • Command: python /usr/local/share/sofa/start.py
  • Icon: /usr/local/share/sofa/images/sofa_48x48.xpm
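
On desktops that follow the freedesktop.org conventions, a launcher with these details is usually stored as a .desktop file. The sketch below only builds the text of such a file from the values listed above; the `desktop_entry` helper is hypothetical, and where you save the result (for example under ~/.local/share/applications) depends on your desktop environment.

```python
def desktop_entry(name, comment, command, icon):
    """Build the text of a freedesktop.org-style launcher file."""
    lines = [
        "[Desktop Entry]",
        "Type=Application",
        "Name=" + name,
        "Comment=" + comment,
        "Exec=" + command,
        "Icon=" + icon,
        "Terminal=false",
    ]
    return "\n".join(lines) + "\n"


# Values taken from the launcher details listed above.
entry = desktop_entry(
    name="SOFA Statistics",
    comment="Analysis package",
    command="python /usr/local/share/sofa/start.py",
    icon="/usr/local/share/sofa/images/sofa_48x48.xpm",
)
print(entry)
```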
You can run SOFA from the command line with the single command sofastats (assuming you ran INSTALL.sh). If you want to set it up manually, details are in the Appendix.

Installation and Configuration for Specific User

When SOFA is run, it checks to see if the user has a sofa folder and adds it if they don't, e.g. /home/username/sofa. It also makes a sofa_recovery folder.
If you are able to get SOFA to launch at all, but there is a problem of some sort, look at the output.txt file in your /home/username/sofa/_internal folder. It may be, for example, that you forgot to install matplotlib.
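
To look at the end of that output.txt file quickly, a small script like the following may be enough. It is a sketch under assumptions: the `last_lines` helper is made up for this example, and the path is built from the current user's home folder following the location described above.

```python
import os


def last_lines(path, count=20):
    """Return the last `count` lines of a text file, or [] if it is missing."""
    if not os.path.isfile(path):
        return []
    with open(path, "r", errors="replace") as handle:
        lines = handle.read().splitlines()
    return lines[-count:]


# Assumed location, following the /home/username/sofa/_internal folder above.
log_path = os.path.join(os.path.expanduser("~"), "sofa", "_internal", "output.txt")
for line in last_lines(log_path):
    print(line)
```

A message such as “No module named ...” near the end of the file usually points to a missing dependency.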

Appendix

Simple Launch from Command Line

Make a text file called runsofastats.sh with the following
#! /bin/bash
python /usr/local/share/sofa/start.py
And save it e.g. to your home folder. If bash is not located in /bin/bash on your system, use the command
which bash 
to find it.
Then make a symlink to it located in /usr/local/bin (NB give everyone rights to run it)
su root
ln -s /home/username/runsofastats.sh /usr/local/bin/sofastats
chmod a+x /usr/local/bin/sofastats
Now you can run SOFA Statistics from the command line by typing in
sofastats
See Linux by example - how to create symlink?

File Locations

Here is where things should go during installation (in Ubuntu it is /usr/share/pyshared/sofa):
/usr/local/share/sofa
/usr/local/share/sofa/boomslang
/usr/local/share/sofa/css
/usr/local/share/sofa/dbe_plugins
/usr/local/share/sofa/googleapi
/usr/local/share/sofa/googleapi/atom
/usr/local/share/sofa/googleapi/gdata
/usr/local/share/sofa/googleapi/gdata/docs
/usr/local/share/sofa/googleapi/gdata/oauth
/usr/local/share/sofa/googleapi/gdata/spreadsheet
/usr/local/share/sofa/googleapi/gdata/tlslite
/usr/local/share/sofa/googleapi/gdata/tlslite/integration
/usr/local/share/sofa/googleapi/gdata/tlslite/utils
/usr/local/share/sofa/images
/usr/local/share/sofa/_internal
/usr/local/share/sofa/locale
/usr/local/share/sofa/locale/gl_ES
/usr/local/share/sofa/locale/gl_ES/LC_MESSAGES
/usr/local/share/sofa/projs
/usr/local/share/sofa/reports
/usr/local/share/sofa/reports/sofa_report_extras
/usr/local/share/sofa/scripts
/usr/local/share/sofa/vdts/
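
After a manual install, one quick way to confirm the layout is to check that the top-level folders from the list above exist. This is only a sketch: the `missing_subdirs` helper is hypothetical, it checks just the first level of folders, and the root path should be changed if your distro uses a different location (such as the Ubuntu one mentioned above).

```python
import os

# Top-level folders taken from the list above.
EXPECTED_SUBDIRS = [
    "boomslang", "css", "dbe_plugins", "googleapi", "images", "_internal",
    "locale", "projs", "reports", "scripts", "vdts",
]


def missing_subdirs(root, expected=EXPECTED_SUBDIRS):
    """Return the expected folder names that do not exist under `root`."""
    return [name for name in expected
            if not os.path.isdir(os.path.join(root, name))]


print("Missing folders:", missing_subdirs("/usr/local/share/sofa"))
```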
In the following example, I downloaded the sofa source code into the Downloads folder in Fedora 14.
Then extract contents of sofa_0.9.21.orig.tar.gz into Downloads folder.
The next lot of commands were performed as root (NB the /* after sofa.main)
su root
cd Downloads/sofa/sofa_0.9.21.orig
cp -r sofa /usr/local/share
cp -r sofa.main/* /usr/local/share/sofa
cp runsofastats.sh /usr/local/share/sofa
In versions prior to 0.9.22, the file permissions were all incorrect. Here is how to fix them:
su root
chmod -R u=rwx /usr/local/share/sofa
chmod -R go=rx /usr/local/share/sofa
NB nothing will work without the dependencies installed. Running:
python /usr/local/share/sofa/start.py
will return a traceback because wxversion or whatever isn't available. So the next step is installing the dependencies.
After installing wxPython, but before adding the other dependencies, running sofa prematurely will result in a message about a problem with the first round of local importing.

Getting Started

Demonstration Data

Before analysing your own data, it can be helpful to play with the demonstration data provided with SOFA Statistics. Click the “Enter/Edit Data” button to get started.

This brings up the data selection dialog. Here you can look at existing data tables or make new ones. Here we just want to look at the demonstration data table “demo_tbl”. Click on “Open”.

Here you can see the data we will be test analysing using SOFA Statistics. Note the pale blue column - the background colour indicates the field is read-only. Typically, read-only fields are autonumbered or timestamps. 


Click on “Close” when you're finished looking.

Making a Simple Report Table

On the main SOFA form, click on “Report Tables”.

Let's start with a simple report table of Age Group vs Country. NB all of this data is fictitious and designed to allow features of the program to be demonstrated.
  1. For “Table Type” select “Crosstabs”. A cross tabulation shows one or more variables against one or more other variables e.g. Age Group in the rows and Country in the columns.
  2. We need to add a row so click on “Add” under the “Rows” label
  3. Select “Age Group” and either double click it or select “OK”.
  4. Under the “Columns” label click on “Add” and add Country.
In the demonstration pane below you will see a rough illustration of what the table will look like. If you want to see the actual table, click on “Run”.

If “Add to report” is ticked, the output will also be saved to the end of the output file specified at the bottom of the form.

Extra Configuration of Report Table

Next you may want to configure the rows and/or columns. Let's add a total column and columns for row and column percentages.

  1. Click on “Config” under the “Columns” label
  2. Tick “Total” under the “Misc” heading
  3. Tick “Column %” and “Row %” under the “measures” heading
  4. Click on “OK” to see changes in demonstration table. NB to see actual results, click on “Run”.

If you click “Run” with “Add to report” ticked, you can view the result by clicking on the “View” button. This will open your default web browser so you can see the output.
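
To make the “Row %” and “Column %” measures concrete, here is a small Python sketch of how a crosstab and its percentages can be computed by hand. It is not how SOFA itself builds report tables; the `crosstab` and `percentages` helpers, the field names, and the records are all made up for illustration.

```python
from collections import Counter


def crosstab(records, row_key, col_key):
    """Count how often each (row value, column value) pair occurs."""
    counts = Counter((rec[row_key], rec[col_key]) for rec in records)
    rows = sorted({r for r, _ in counts})
    cols = sorted({c for _, c in counts})
    return rows, cols, counts


def percentages(rows, cols, counts):
    """Return row percentages and column percentages for every cell."""
    row_totals = {r: sum(counts[(r, c)] for c in cols) for r in rows}
    col_totals = {c: sum(counts[(r, c)] for r in rows) for c in cols}
    row_pct = {(r, c): 100.0 * counts[(r, c)] / row_totals[r]
               for r in rows for c in cols}
    col_pct = {(r, c): 100.0 * counts[(r, c)] / col_totals[c]
               for r in rows for c in cols}
    return row_pct, col_pct


# Fictitious records, for illustration only.
records = [
    {"agegroup": "< 20", "country": "Japan"},
    {"agegroup": "< 20", "country": "Italy"},
    {"agegroup": "20-29", "country": "Japan"},
    {"agegroup": "20-29", "country": "Japan"},
    {"agegroup": "20-29", "country": "Germany"},
    {"agegroup": "30-39", "country": "Italy"},
]

rows, cols, counts = crosstab(records, "agegroup", "country")
row_pct, col_pct = percentages(rows, cols, counts)
for r in rows:
    for c in cols:
        print(r, c, counts[(r, c)],
              round(row_pct[(r, c)], 1), round(col_pct[(r, c)], 1))
```

A row percentage divides a cell by its row total, and a column percentage divides the same cell by its column total, which is why the two values for one cell can differ.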

The styling of your table can also be changed.

Further documentation on making report tables is in Making Report Tables.

ANOVA

Click on the “Statistics” button on the main SOFA form.
Then click on the “CONFIGURE TEST” button (ANOVA should already be selected).

Let's look at whether there is a difference between the average ages in the 3 different countries. NB all the data here is fictitious and only for example purposes.
  1. Select the variable that is averaged (the one we think might vary between groups). In this case, select “Age”.
  2. Select the variable with the groups. In this case, select “Country” and then select “Group A” and “Group B”.
  3. Click on “Run” to see results.

In this case, there is probably a real difference (p has a very small value). Looking at the mean age for each group and the distribution for each group will help us decide how important the difference is for the purpose at hand. NB a difference can be statistically significant and clinically/politically/practically etc. insignificant.
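
For readers curious about what a one-way ANOVA computes, here is a minimal sketch of the F statistic using only the Python standard library. It is not SOFA's implementation; the `one_way_anova_f` helper and the ages are made up. It stops at the F statistic and degrees of freedom, because turning F into a p value needs the F distribution, which the standard library does not provide.

```python
def one_way_anova_f(groups):
    """Return (F statistic, df between groups, df within groups)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Fictitious ages for three groups, for illustration only.
ages_by_country = [
    [21, 22, 23],
    [22, 23, 24],
    [25, 26, 27],
]
f_stat, df_between, df_within = one_way_anova_f(ages_by_country)
print(f_stat, df_between, df_within)
```

A large F means the group means differ a lot relative to the spread inside each group, which is what leads to a small p value.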


Copyright © 2014 Linuxlandit & The Conqueror Penguin
-->