Saturday, October 22, 2016

Google Maps V2 Android Tutorial 0 - Setup

The Google Maps V2 Android API is very useful for building Android apps that need a map or some form of navigation. This series of tutorials will teach you how to build your own app using Google Maps.

In this first tutorial, we are simply going to go through setup:

1. The very first thing we need to do is download Android Studio.

2. Then we need to add Google Play services to Android Studio.

3. Start Android Studio

4. If you see the welcome dialog, choose Start a new Android Studio project, available under Quick Start. Otherwise, click File > New > New Project.

5. Enter your app name, company domain, and project location. Click Next.

6. Select Google Maps Activity as the activity template, then enter the activity name, layout name, and title. Click Finish.

7. After Gradle finishes building, open google_maps_api.xml.

8. Follow the instructions in that file to obtain a Google Maps API key and paste it into the file.

You have now completed setup. Let's take a look at the code that Android Studio generated automatically.

The XML File

This XML file can be found in the Project tab under res/layout/YOUR_ACTIVITY_NAME.xml:
<fragment xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:id="@+id/map"
    tools:context=".MapsActivity"
    android:name="com.google.android.gms.maps.SupportMapFragment" />
This file defines how your app appears on screen when it runs on an Android device. Using Android Studio's layout editor, you can add widgets such as buttons, text boxes, pop-ups, and other controls. We will go into more detail about the layout XML file in future tutorials.

The Maps Activity Java File

This file can be found in the Project tab under java/com.xxx.xxx/YOUR_ACTIVITY_NAME.java. It contains the code that runs when you start your Google Maps application. By default, your activity file should look something like this:

import android.os.Bundle;
import android.support.v4.app.FragmentActivity;
import com.google.android.gms.maps.CameraUpdateFactory;
import com.google.android.gms.maps.GoogleMap;
import com.google.android.gms.maps.OnMapReadyCallback;
import com.google.android.gms.maps.SupportMapFragment;
import com.google.android.gms.maps.model.LatLng;
import com.google.android.gms.maps.model.MarkerOptions;
public class MapsActivity extends FragmentActivity implements OnMapReadyCallback {

    private GoogleMap mMap;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_maps);
        SupportMapFragment mapFragment = (SupportMapFragment) getSupportFragmentManager()
                .findFragmentById(R.id.map);
        mapFragment.getMapAsync(this);
    }

    @Override
    public void onMapReady(GoogleMap googleMap) {
        mMap = googleMap;

        // Add a marker in Sydney, Australia, and move the camera.
        LatLng sydney = new LatLng(-34, 151);
        mMap.addMarker(new MarkerOptions().position(sydney).title("Marker in Sydney"));
        mMap.moveCamera(CameraUpdateFactory.newLatLng(sydney));
    }
}

Android Manifest File

The last file we will look at is the Android manifest, which can be found in the Project tab under manifests/AndroidManifest.xml. This file declares the permissions, components, libraries, and other essential details of your application. Every Android application needs a manifest before it can be installed and run.

Throughout the construction of your Google Maps Android Application, you will be mainly modifying these three files.

As of right now, you will probably need a physical Android device to run your Google Maps application, since the Android virtual device may lack location services capability. Make sure Developer options and USB debugging are enabled on your Android device. Then plug in your device and click the green play button at the top of Android Studio. This compiles your code and installs the APK (Android Package Kit) on your device. Run the app to make sure it's working. By default, it should show a simple map that you can zoom and pan.

In the next tutorial, I will show you how to work with markers, as well as how to use your device's location services and integrate them into your application. Stay tuned!

Wednesday, July 6, 2016

Google Maps JavaScript API Tutorial - 1 - Basic API Overview

Now that we have set up our Google Maps JavaScript API key, let's begin building our web app. First, let's get our app to display the map by calling the Google Maps API. Create an index.html file in your desired folder; this will be your web page. To call the API, we use an HTML script tag:
<script src="https://maps.googleapis.com/maps/api/js?key=MYAPIKEY&v=3&callback=initMap" async defer></script>
Now that we have called the API, we need to show the map on our HTML page and define our initMap function.
<div id="map"></div>
<script>
  var map;
  function initMap() {
    map = new google.maps.Map(document.getElementById('map'), {
      center: {lat: 40.7413594, lng: -73.9980244},
      zoom: 13
    });
  }
</script>
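Now our map is good to go. Test your web application by navigating to your HTML file in your file explorer and opening it in your browser. If the map does not appear, make sure the map div has an explicit height in your CSS, since a div with no height collapses to nothing. For the full code of this tutorial please visit here.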

Google Maps JavaScript API Tutorial - Setup

The Google Maps JavaScript API is a powerful set of tools that you can call to build your very own web mapping application. The very first thing you will need to do is set up your own Google Developer Project.

Setup

1. In your browser, visit https://console.developers.google.com.
2. Next, sign in with your existing Gmail account. (If you don't have one, please create one here.)
3. Click on Select a project and then click on Create a project in the drop-down menu.
4. Enter a creative project name, then read and agree to the terms of service.
5. Click Create.
6. You will need to enable the following APIs under Google Maps APIs by clicking and enabling each of them.
  • Google Maps Javascript API
  • Google Maps Roads API
  • Google Static Maps API
  • Google Street View Image API
  • Google Places API Web Service
  • Google Maps Geocoding API
  • Google Maps Directions API
  • Google Maps Distance Matrix API
  • Google Maps Geolocation API
  • Google Maps Elevation API
  • Google Maps Time Zone API
7. Click the Credentials menu item on the left of the screen.
8. Click Create Credentials.
9. Select API Key and select Browser Key.
10. Name your API key and click Create.

Let's start mapping. In the next tutorial, we will talk about basic techniques for accessing the Google Maps JavaScript API to start off our web application.

Google Maps JavaScript API Tutorial - 2 - Markers

Now that we have our map up and running, let's play around with saving different points on our map. In the Google Maps API, these points are called markers.
To set a fixed marker on our map, we must first get its position as a latitude/longitude pair.
var tribeca = {lat: 40.719526, lng: -74.0089934};
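Now that we have saved our location in a variable called tribeca, let's create our marker. This code belongs inside initMap, after the map has been created, because the marker needs the map object.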
var marker = new google.maps.Marker({
  position: tribeca,
  map: map,
  title: 'My Marker!'
});
Now if we open our HTML file in the browser, we see our newly placed marker.

To view the full code, please visit here.
Stay tuned for the next lesson: Marker Infowindows

Sunday, May 22, 2016

ZetaMachina - Easy Machine Learning in R

Introducing ZetaMachina.

Machine Learning in R



What is machine learning? To quote Wikipedia, "Machine learning is a subfield of computer science that evolved from the study of pattern recognition and computational learning theory in artificial intelligence." In other words, it uses previously collected data to predict outcomes for new data. This can help us understand the correlation between different events or ideas in the world.

There are multiple types of machine learning, including regression and classification. In both, we train a model on previously collected data and use that model to predict outcomes for the testing data (data whose outcome we do not know). Regression predicts a numerical output. Classification predicts a categorical output. In this post, we will mainly focus on classification.

We have built ZetaMachina, an implementation for classification prediction in R. This post will not show the full code but please check the GitHub Repo for it.

In this project we created our own people data frame as a demo.

We create a female and a male data frame and then merge them. The combined data frame describes people with three attributes: race, hair color, and gender. The two categorical input variables are race and hair color, while the output variable, gender, is also categorical.

female <- data.frame(race=femaleRace, hair=femaleHair, gender="F")
# the male data frame is built the same way
people <- rbind(female, male)
count <- 1:1000
ind <- sample(count, 800, replace=FALSE)
training <- people[ind, ]
testing <- people[-ind, ]

We have now created our people data frame, as well as the training data for our model and our testing data.
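The same 800/200 split can be sketched in Python using only the standard library. This is an illustration of the idea rather than part of ZetaMachina; the fixed seed and the variable names are assumptions made for the example, and the code uses Python 3 syntax.

```python
import random

random.seed(42)  # fixed seed (an assumption) so the split is repeatable

count = list(range(1000))                    # row positions 0..999
train_idx = set(random.sample(count, 800))   # 800 distinct rows, no replacement
test_idx = [i for i in count if i not in train_idx]  # the remaining 200 rows

print(len(train_idx), len(test_idx))
```

Sampling without replacement guarantees that no row appears in both sets, so the model is never tested on data it was trained on.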

We then create our model using ZetaNaiveBayes. Naive Bayes assumes that the input variables are independent of one another given the class.

# argument order reconstructed from a garbled listing; see the GitHub repo for the exact call
model <- ZetaNaiveBayes(training, c("race", "hair"), "gender")

Now we create a prediction based on the model and our testing data.

prediction <- ZetaPredict(model, testing)
prediction <- factor(prediction, levels=c("M", "F"))

If we print out the result we get our prediction:
#console
[1] M M M M F M M M M M M M M M M M M M M M M F M F M M F M M M F M M M M F M M M M M M M M F M M M M M F M M M M M
[57] M M M M M M M M M M M M M M M F M M M M M M M M M M M M F M F M M M M M M M M M M M M M F M F F F M F F F M M F
[113] F F F F F F M F M M F F F F M F F F F F F M F F M M F F F F F F M F F F M F F F M F F F F F F M F F F F F M F F
[169] F F F F M F F F M F F F M F F M M F F F F M M F F M F M F F F F
Levels: M F
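To make the independence assumption behind Naive Bayes concrete, here is a minimal categorical Naive Bayes sketch in Python 3. It is not ZetaMachina's implementation: the function names, the add-one smoothing, and the tiny data set are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict

def train(rows, label_key):
    """Count labels and, per label, the values seen for each feature."""
    label_counts = Counter(row[label_key] for row in rows)
    feature_counts = defaultdict(Counter)  # (label, feature) -> value counts
    values = defaultdict(set)              # feature -> distinct values seen
    for row in rows:
        for feature, value in row.items():
            if feature != label_key:
                feature_counts[(row[label_key], feature)][value] += 1
                values[feature].add(value)
    return label_counts, feature_counts, values

def predict(model, row):
    """Return the label with the highest P(label) * product of P(value | label)."""
    label_counts, feature_counts, values = model
    total = sum(label_counts.values())
    scores = {}
    for label, n in label_counts.items():
        score = n / total  # prior probability of the label
        for feature, value in row.items():
            # Independence assumption: multiply the per-feature likelihoods.
            # Add-one smoothing keeps an unseen value from zeroing the product.
            seen = feature_counts[(label, feature)][value]
            score *= (seen + 1) / (n + len(values[feature]))
        scores[label] = score
    return max(scores, key=scores.get)

# Tiny made-up data set (hypothetical values, for illustration only).
people = [
    {"hair": "long", "race": "A", "gender": "F"},
    {"hair": "long", "race": "B", "gender": "F"},
    {"hair": "short", "race": "A", "gender": "M"},
    {"hair": "short", "race": "B", "gender": "M"},
    {"hair": "long", "race": "A", "gender": "M"},
]
model = train(people, "gender")
print(predict(model, {"hair": "short", "race": "A"}))
```

Because each feature contributes its own factor to the product, the model only needs simple counts per feature and label, which is what makes Naive Bayes cheap to train.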


If you enjoyed this project, visit our other projects on our GitHub Page

Saturday, May 7, 2016

NumPy Tutorial

NumPy is a fundamental Python package used for scientific computing. Within the package, there are many useful features including n-dimensional arrays, element-wise operations, broadcasting functions, linear algebra, random number capabilities, etc. This tutorial provides some fundamental examples of those features in NumPy.

Create An Array with NumPy

In [26]:

import numpy as np # import the NumPy library
a = np.array([1, 2, 3, 4, 5])
print a
print a*2

Out [26]:

[1 2 3 4 5]
[ 2  4  6  8 10]

Create A One-Dimensional Array

In [27]:

a = np.arange(20) # one dimension
print a
print a.shape
print a.ndim
print a.dtype

Out[27]:
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
(20L,)
1
int32

Create A Two-Dimensional Array

In [28]:

a = np.array([[1, 2, 3], [4, 5, 6]]) # two dimensions
print a 
print a.shape
print a.ndim
print a.dtype

Out[28]:

[[1 2 3]
 [4 5 6]]
(2L, 3L)
2
int32

Create A Three-Dimensional Array

In [29]:

a = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]) # three dimensions
print a
print a.shape
print a.ndim
print a.dtype

Out[29]:
[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]
(2L, 2L, 3L)
3
int32

How to Generate an Array from a Sequence

In [21]:

print np.arange(20, 30, 2) # evenly spaced integers from 20 up to (not including) 30, step 2
print np.linspace(0, 1, 20) # 20 evenly spaced floats from 0 to 1, both ends included
print np.random.rand(5) # five random floats in the interval [0, 1)

Out[21]:

[20 22 24 26 28]
[ 0.          0.05263158  0.10526316  0.15789474  0.21052632  0.26315789
  0.31578947  0.36842105  0.42105263  0.47368421  0.52631579  0.57894737
  0.63157895  0.68421053  0.73684211  0.78947368  0.84210526  0.89473684
  0.94736842  1.        ]
[ 0.05481854  0.83857248  0.31190602  0.30903261  0.17243025]
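The difference between the three generators is easiest to see by checking their endpoints and spacing. The sketch below uses Python 3 syntax; the seeded generator from np.random.default_rng is an addition for repeatability and is not used in the original notebook.

```python
import numpy as np

ints = np.arange(20, 30, 2)      # start, stop (excluded), step
floats = np.linspace(0, 1, 20)   # start, stop (included), number of points

print(ints)
print(len(floats), floats[1] - floats[0])  # spacing is (1 - 0) / (20 - 1)

rng = np.random.default_rng(0)   # seeded generator so runs are repeatable
print(rng.random(5))             # five floats in [0, 1)
```

Use arange when you know the step you want and linspace when you know how many points you want.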

Indexing and Slicing

Indexing and Slicing are very important when dealing with vast amounts of data. This allows you to cut down the array and narrow it to the part of the data you want to analyze.

In [39]:

print a
print a[0:2, 0:2, 2]

Out[39]:
[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]
[[ 3  6]
 [ 9 12]]
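One detail worth seeing in isolation is how an integer index differs from a slice. A small sketch in Python 3 syntax, rebuilding the same 3-D array with arange and reshape (that construction is an assumption; the values match the array above):

```python
import numpy as np

a = np.arange(1, 13).reshape(2, 2, 3)  # same values as the 3-D array above

last = a[0:2, 0:2, 2]    # an integer index on the last axis drops that axis
kept = a[0:2, 0:2, 2:3]  # a length-1 slice keeps the axis

print(last.shape)
print(kept.shape)
print(a[1, 0, :])        # second block, first row
```

This is why the result printed above is two-dimensional even though "a" has three dimensions.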

Manipulating Array Shape

Changing an array's shape is often necessary before it can be combined with other multi-dimensional arrays in an operation.

In [54]:

b = np.copy(a)
b.shape = (12L,)
print b

Out[54]:
[ 1  2  3  4  5  6  7  8  9 10 11 12]
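An alternative to assigning to the shape attribute is the reshape method, which leaves the original array's shape untouched. A minimal sketch in Python 3 syntax (the variable names are illustrative):

```python
import numpy as np

a = np.arange(1, 13).reshape(2, 2, 3)

flat = a.reshape(-1)     # -1 lets NumPy infer the length (12 here)
grid = a.reshape(3, 4)   # the total element count must stay 12

print(flat.shape, grid.shape)
print(grid)
print(a.shape)           # the original array keeps its shape
```

Any new shape works as long as the number of elements is unchanged; otherwise NumPy raises a ValueError.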

Boolean Masking

Boolean masking cuts an array down to the elements that satisfy a condition. Here "b" is an array of boolean values that are True wherever the value in "a" is a multiple of 5. Indexing "a" with "b" then returns just those values.

In [72]:

a = np.arange(0, 105) + 1
b = a%5==0
print a[b]

Out[72]:
[  5  10  15  20  25  30  35  40  45  50  55  60  65  70  75  80  85  90
  95 100 105]
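Masks can also be combined before indexing. The following sketch, in Python 3 syntax, reuses the same range of values; the second condition (greater than 90) is an assumption added for illustration.

```python
import numpy as np

a = np.arange(1, 106)

mask = (a % 5 == 0) & (a > 90)       # & combines two masks element by element
print(a[mask])                        # multiples of 5 above 90
print(np.count_nonzero(a % 5 == 0))   # how many multiples of 5 there are
```

The parentheses around each comparison are required, because & binds more tightly than == and >.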

Element-wise Operations

Element-wise operations apply an operation to each pair of corresponding elements in two arrays, without writing an explicit loop.

In [77]:

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print a+b

Out[77]:
[5 7 9]
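Element-wise operations also work when the shapes differ but are compatible, which NumPy calls broadcasting. A short sketch in Python 3 syntax; the array "col" is a hypothetical addition for the example.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(a * b)                    # element-wise product, not a dot product
print(a + 10)                   # the scalar is broadcast to every element

col = np.array([[10], [20]])    # shape (2, 1)
print(col + a)                  # shapes (2, 1) and (3,) broadcast to (2, 3)
```

Broadcasting stretches axes of length 1 to match the other operand, so no copies of the data need to be written out by hand.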

Matrix Multiplication

In [78]:

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[7, 8], [9, 10], [11, 12]])
a.dot(b)

Out[78]:

array([[ 58,  64],
       [139, 154]])
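The result has shape (2, 2) because a (2, 3) matrix times a (3, 2) matrix keeps the outer dimensions. The sketch below, in Python 3 syntax, checks one entry by hand and uses the @ operator, which is equivalent to dot for 2-D arrays.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])        # shape (2, 3)
b = np.array([[7, 8], [9, 10], [11, 12]])   # shape (3, 2)

product = a @ b   # same result as a.dot(b); requires Python 3.5 or later

# Row 0 of a times column 0 of b: 1*7 + 2*9 + 3*11
top_left = sum(a[0, k] * b[k, 0] for k in range(3))

print(product.shape)
print(top_left == product[0, 0])
```

Each entry of the product is a row of the first matrix multiplied term by term with a column of the second, then summed.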

Logical Operations

In [86]:

c = a>2
d = a>8
print c
print d
print np.logical_and(c, d)

Out[86]:

[[False False  True]
 [ True  True  True]]
[[False False False]
 [False False False]]
[[False False False]
 [False False False]]

Basic Reductions

In [106]:

a.shape = (6L,) # One dimension
print a
print np.sum(a)
print np.mean(a)
print np.std(a)
print np.size(a)

Out[106]:

[1 2 3 4 5 6]
21
3.5
1.70782512766
6

In [125]:

a = np.arange(0, 105) + 1
a.shape = (3L, 5L, 7L)
print a
print np.sum(np.sum(a, axis=2), axis=0)

Out[125]:

[[[  1   2   3   4   5   6   7]
  [  8   9  10  11  12  13  14]
  [ 15  16  17  18  19  20  21]
  [ 22  23  24  25  26  27  28]
  [ 29  30  31  32  33  34  35]]

 [[ 36  37  38  39  40  41  42]
  [ 43  44  45  46  47  48  49]
  [ 50  51  52  53  54  55  56]
  [ 57  58  59  60  61  62  63]
  [ 64  65  66  67  68  69  70]]

 [[ 71  72  73  74  75  76  77]
  [ 78  79  80  81  82  83  84]
  [ 85  86  87  88  89  90  91]
  [ 92  93  94  95  96  97  98]
  [ 99 100 101 102 103 104 105]]]

[ 819  966 1113 1260 1407]

In [126]:

sum(range(1, 8)+range(36, 43)+range(71, 78))

Out[126]:

819
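The nested sum can also be written as a single reduction over two axes. A sketch in Python 3 syntax that compares the two forms (the names "nested" and "direct" are illustrative):

```python
import numpy as np

a = np.arange(1, 106).reshape(3, 5, 7)

nested = np.sum(np.sum(a, axis=2), axis=0)  # collapse axis 2, then axis 0
direct = a.sum(axis=(0, 2))                 # collapse both axes in one call

print(direct)
print(np.array_equal(nested, direct))
print(direct.shape)   # only axis 1 (length 5) remains
```

Passing a tuple of axes states the intent directly: every axis listed is summed away, and the remaining axis determines the shape of the result.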

All of the examples above cover the basic operations of the NumPy library. NumPy can greatly improve a programmer's efficiency when working with large data sets. For more information on NumPy, visit here

Tuesday, February 16, 2016

Introduction to Pandas

Introduction

One of the biggest problems today is how to analyze data quickly. For example, say we gave one class at school a survey asking each student for their favorite food and favorite color, because we wanted to know the most popular color and food for the theme of the next dance. At first, counting the answers by hand would be practical. However, what would happen if we expanded the survey to include more classes, or even more schools? Counting would become too tedious and eventually impractical because it would take too long. The same problem appears in the real world.

The solution to this problem is being able to store the data in an efficient manner. In order to store information in an organized manner, programmers use dataframes. A data frame is a table (or a two-dimensional array-like structure) which stores data. Each column contains measurements on one variable and each row contains one case. Using this data frame, we can easily access the information for analysis.

In this post, we will study data frames in Python using the library called Pandas.
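As a first taste, the survey from the introduction can be held in a small data frame. This is a sketch in Python 3 syntax; the student names and answers are invented for the example.

```python
import pandas as pd

# Hypothetical survey answers, invented for this example.
survey = pd.DataFrame({
    "student": ["Ana", "Ben", "Cy", "Dee", "Eli"],
    "food": ["pizza", "tacos", "pizza", "sushi", "pizza"],
    "color": ["blue", "red", "blue", "green", "blue"],
})

print(survey.shape)                              # (rows, columns)
print(survey["food"].value_counts())             # how often each food was chosen
print(survey["color"].value_counts().idxmax())   # the most common color
```

Each column holds one variable and each row holds one student, so counting the favorites is a single call no matter how many rows the survey grows to.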

Data Frame Basics in Pandas

In this next section, we will start pandas, read in a csv file to store in a data frame, and then explore its different attributes.

In [3]:
import pandas as pd # start pandas
df = pd.read_csv("./data/2012/weather-2012-01-01.csv") # read in csv file from path and store in data frame, df
print df.shape # (number of rows, number of columns)
(23, 14)
In [4]:
print df.columns # names of columns
Index([u'TimeEST', u'TemperatureF', u'Dew PointF', u'Humidity',
       u'Sea Level PressureIn', u'VisibilityMPH', u'Wind Direction',
       u'Wind SpeedMPH', u'Gust SpeedMPH', u'PrecipitationIn', u'Events',
       u'Conditions', u'WindDirDegrees', u'DateUTC'],
      dtype='object')
In [5]:
print df.index # names of rows
Int64Index([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
            17, 18, 19, 20, 21, 22],
           dtype='int64')
In [8]:
print df.dtypes # data types of columns
TimeEST                  object
TemperatureF            float64
Dew PointF              float64
Humidity                  int64
Sea Level PressureIn    float64
VisibilityMPH           float64
Wind Direction           object
Wind SpeedMPH            object
Gust SpeedMPH            object
PrecipitationIn         float64
Events                  float64
Conditions               object
WindDirDegrees            int64
DateUTC                  object
dtype: object
In [9]:
print df.describe() # shows 1-var statistics of numeric columns of data frame (mean, std, min ...)
       TemperatureF  Dew PointF   Humidity  Sea Level PressureIn  \
count     23.000000   23.000000  23.000000             23.000000   
mean      52.613043   39.247826  63.782609             30.032174   
std       12.038699    6.323180  19.376547              0.117162   
min       35.100000   32.000000  34.000000             29.870000   
25%       39.100000   33.800000  46.000000             29.915000   
50%       57.900000   37.900000  58.000000             30.040000   
75%       63.500000   44.550000  81.500000             30.150000   
max       66.900000   52.000000  92.000000             30.160000   

       VisibilityMPH  PrecipitationIn  Events  WindDirDegrees  
count             23             1.00       0       23.000000  
mean              10             0.01     NaN      131.304348  
std                0              NaN     NaN      121.292447  
min               10             0.01     NaN        0.000000  
25%               10             0.01     NaN        0.000000  
50%               10             0.01     NaN      200.000000  
75%               10             0.01     NaN      210.000000  
max               10             0.01     NaN      310.000000  
In [10]:
print df.head(5) # shows first five records of a dataframe.
    TimeEST  TemperatureF  Dew PointF  Humidity  Sea Level PressureIn  \
0  12:51 AM          39.2        33.8        81                 30.15   
1   1:51 AM          39.2        33.8        81                 30.16   
2   2:51 AM          39.0        34.0        82                 30.15   
3   3:51 AM          36.0        33.1        89                 30.15   
4   4:51 AM          35.1        33.1        92                 30.15   

   VisibilityMPH Wind Direction Wind SpeedMPH Gust SpeedMPH  PrecipitationIn  \
0             10           Calm          Calm             -              NaN   
1             10           Calm          Calm             -              NaN   
2             10           Calm          Calm             -              NaN   
3             10           Calm          Calm             -              NaN   
4             10           Calm          Calm             -              NaN   

   Events Conditions  WindDirDegrees              DateUTC  
0     NaN      Clear               0  2012-01-01 05:51:00  
1     NaN      Clear               0  2012-01-01 06:51:00  
2     NaN      Clear               0  2012-01-01 07:51:00  
3     NaN      Clear               0  2012-01-01 08:51:00  
4     NaN      Clear               0  2012-01-01 09:51:00  

Extracting Data

In this section, we will slice information from our data frame.

In [12]:
print df.loc[:,['TemperatureF', 'Humidity']] # shows all records for the two columns: Temperature and Humidity
    TemperatureF  Humidity
0           39.2        81
1           39.2        81
2           39.0        82
3           36.0        89
4           35.1        92
5           37.0        86
6           37.9        83
7           37.9        86
8           46.9        71
9           53.6        58
10          60.1        44
11          63.0        38
12          66.9        34
13          66.0        38
14          66.9        44
15          64.9        48
16          64.0        54
17          64.0        56
18          60.1        75
19          59.0        78
20          60.1        57
21          57.9        51
22          55.4        41
In [14]:
df.loc[df.TemperatureF > 40,['TemperatureF', 'Humidity']]
#boolean indexing: Shows temperature and humidity for all records that have a temperature greater than 40
Out[14]:
    TemperatureF  Humidity
8           46.9        71
9           53.6        58
10          60.1        44
11          63.0        38
12          66.9        34
13          66.0        38
14          66.9        44
15          64.9        48
16          64.0        54
17          64.0        56
18          60.1        75
19          59.0        78
20          60.1        57
21          57.9        51
22          55.4        41
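Boolean indexing works by building a True/False Series and handing it to loc, and several conditions can be combined. A sketch in Python 3 syntax, using a few rows copied from the table above; the humidity threshold of 50 is an assumption for illustration.

```python
import pandas as pd

# A few rows copied from the weather table above.
df = pd.DataFrame({
    "TemperatureF": [39.2, 46.9, 60.1, 66.9, 59.0],
    "Humidity": [81, 71, 44, 34, 78],
})

warm = df["TemperatureF"] > 40                 # boolean Series, one value per row
warm_and_dry = warm & (df["Humidity"] < 50)    # both conditions must hold

print(df.loc[warm, ["TemperatureF", "Humidity"]])
print(df.loc[warm_and_dry])
```

As with NumPy masks, each comparison needs its own parentheses when conditions are joined with & or |.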

In this next experiment, we read all the csv files from a specific directory and concatenate all the data frames together.

In [15]:
import glob
path = "./data/2012" # path for csv files
allFiles = glob.glob(path + "/*.csv") # store all the csv files into "allFiles"
df2012 = None
first_time = True
for myfile in allFiles: # run through each csv file
    df5 = pd.read_csv(myfile)
    scopy = list(df5.columns) # copy the column names into a list
    scopy[0] = 'TimeEST' # give the first column the same name in every file
    df5.columns = scopy
    if first_time:# check if this is the first data frame.
        df2012 = df5
        first_time = False
    else:
        df2012 = df2012.append(df5, ignore_index=True) #concatenate the data frames

print df2012.shape
print df2012.describe()
(9617, 14)
       TemperatureF   Dew PointF     Humidity  Sea Level PressureIn  \
count   9617.000000  9617.000000  9616.000000           9617.000000   
mean      62.349756    50.201497    70.014247             30.040644   
std       15.410849   103.758176    19.626160              0.197613   
min       19.400000 -9999.000000    16.000000             29.290000   
25%       50.000000    37.900000    54.000000             29.910000   
50%       64.000000    54.000000    73.000000             30.040000   
75%       73.900000    64.900000    88.000000             30.170000   
max      105.100000    75.900000   100.000000             30.690000   

       VisibilityMPH  PrecipitationIn  WindDirDegrees  
count    9617.000000      1169.000000     9617.000000  
mean        9.062036         0.053336      118.418426  
std         2.323571         0.118034      111.996815  
min         0.100000         0.000000        0.000000  
25%        10.000000         0.000000        0.000000  
50%        10.000000         0.010000       90.000000  
75%        10.000000         0.050000      230.000000  
max        10.000000         1.420000      360.000000  
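In current pandas versions, DataFrame.append has been removed (as of pandas 2.0), and pd.concat is the usual way to stack frames. The sketch below, in Python 3 syntax, shows the same idea on two small made-up frames standing in for CSV files; the column name TimeEDT is a hypothetical example of a first column that differs between files.

```python
import pandas as pd

# Two small made-up frames standing in for two daily CSV files.
day1 = pd.DataFrame({"TimeEST": ["12:51 AM", "1:51 AM"],
                     "TemperatureF": [39.2, 39.2]})
day2 = pd.DataFrame({"TimeEDT": ["12:51 AM"],   # hypothetical differing name
                     "TemperatureF": [45.0]})

frames = []
for frame in (day1, day2):
    # Rename the first column so every frame has the same column names.
    frame = frame.rename(columns={frame.columns[0]: "TimeEST"})
    frames.append(frame)   # list.append, not DataFrame.append

combined = pd.concat(frames, ignore_index=True)  # stack rows, renumber the index
print(combined.shape)
print(list(combined.index))
```

Collecting the frames in a list and concatenating once also avoids copying the growing data frame on every loop iteration.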

Plotting

Data is much easier to understand when it is presented visually. To make our data more visual, we will now plot the data frames in different types of charts.

Univariate numeric plotting

The first type of plot is a univariate numeric plot. We analyze different numeric columns by using different graphs like the histogram, the density plot, and the box plot.

In [16]:
%matplotlib inline
df2012[['Humidity']].hist(alpha=0.5, bins=5) # plot histogram of humidity
Out[16]:
array([[<matplotlib.axes._subplots.AxesSubplot object at 0x0000000008BE76D8>]], dtype=object)
In [17]:
df2012[['TemperatureF']].plot(kind="kde") # plot density plot of temperature
Out[17]:
<matplotlib.axes._subplots.AxesSubplot at 0x9524e10>
In [18]:
df2012['Humidity'].plot(kind="box") # plot box plot of humidity
Out[18]:
<matplotlib.axes._subplots.AxesSubplot at 0x95245f8>

Univariate Categorical plotting

The second type of plot is a univariate categorical plot. We analyze different categorical columns by using different graphs like the barplot and the pie chart.

In [27]:
df2012['Conditions'].value_counts().plot(kind='bar') # plot bar plot of conditions
Out[27]:
<matplotlib.axes._subplots.AxesSubplot at 0xca44ef0>
In [29]:
df2012['Conditions'].value_counts().plot(kind='pie', figsize=(7,7)) # plot pie chart
Out[29]:
<matplotlib.axes._subplots.AxesSubplot at 0xcf12a90>

Bivariate numeric plotting

The third type of plot is a bivariate numeric plot. We take two numeric columns and use a scatterplot to see whether there is a correlation between the two variables.

In [30]:
df2012.plot(kind="scatter", x='Humidity', y='TemperatureF', s=2) # plot scatterplot of Humidity against Temperature.
Out[30]:
<matplotlib.axes._subplots.AxesSubplot at 0xd6687b8>

Bivariate: categorical and numeric plotting

The fourth type of plot is a bivariate categorical and numeric plot. We take one categorical column and one numeric column and use either a multi-boxplot or a multi-density plot to see whether there is a relationship between the two variables.

In [33]:
import matplotlib.pyplot as plt
df2012.boxplot(column='TemperatureF', by='Conditions')# plot multiboxplot of Temperature and Conditions
Out[33]:
<matplotlib.axes._subplots.AxesSubplot at 0xe0c59e8>
In [35]:
def plotdensity(df_var):
    df_var['TemperatureF'].plot(kind='kde')
    
df2012.loc[0:700,['TemperatureF', 'Conditions']].groupby(['Conditions']).apply(plotdensity)
# Plot multi density plot of Temperature against Conditions
Out[35]: