I know 3 high level api for deep learning. They are Tensorflow.contrib.learn(SKFlow), TFLearn and Keras. All of them are great tools, but maybe I like Keras because of the easy style of code.
When I came through the Tensorflow Tutorial of Deep and Wide Learning, I felt I can translate the code from tf.contrib.learn into Keras.
In Tensorflow’s tutorial, they said that, using Wide&Deep Model together, they could improve the accuracy from 83.6% to about 84.4%. After my Keras code was finished, I found that the accuracy of my new model is more than 85%. Although I used more units in the model, that’s still nice I think!~
Here is my code for Keras:
from keras.models import Sequential from keras.layers import Dense from sklearn.preprocessing import MinMaxScaler import numpy as np import pandas as pd COLUMNS = ["age", "workclass", "fnlwgt", "education", "education_num", "marital_status", "occupation", "relationship", "race", "gender", "capital_gain", "capital_loss", "hours_per_week", "native_country", "income_bracket"] LABEL_COLUMN = "label" CATEGORICAL_COLUMNS = ["workclass", "education", "marital_status", "occupation", "relationship", "race", "gender", "native_country"] CONTINUOUS_COLUMNS = ["age", "education_num", "capital_gain", "capital_loss", "hours_per_week"] def load_df(filename): with open(filename, 'r') as f: df = pd.read_csv(f, names = COLUMNS, skipinitialspace=True, engine='python') return df def preprocess(df): df[LABEL_COLUMN] = (df['income_bracket'].apply(lambda x: ">50K" in x)).astype(int) df.pop("income_bracket") y = df[LABEL_COLUMN].values df.pop(LABEL_COLUMN) # This makes One-Hot Encoding: df = pd.get_dummies(df, columns=[x for x in CATEGORICAL_COLUMNS]) # This makes scaled: df = pd.DataFrame(MinMaxScaler().fit_transform(df), columns=df.columns) X = df.values return X,y def main(): df_train = load_df('data/traindata') df_test = load_df('data/testdata') train_len = len(df_train) df = pd.concat([df_train, df_test]) X, y = preprocess(df) X_train = X[:train_len] y_train = y[:train_len] X_test = X[train_len:] y_test = y[train_len:] model = Sequential() model.add(Dense(1024, input_dim=X_train.shape, activation='relu')) model.add(Dense(1024)) model.add(Dense(1, activation='sigmoid')) model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy']) model.fit(X_train, y_train, nb_epoch=10, batch_size=32 ) loss, accuracy = model.evaluate(X_test, y_test) print("") print(accuracy) if (__name__=='__main__'): main()
During coding, I found preprocessing is very important:
- By using One-Hot Encoding, I even don’t need to treat Wide and Deep model separately. I just one-hot encode the categorical columns, and they can be feed together with continuous columns.
- By using MaxMinScaler, I can avoid the result goes to one side at all. For example, if I don’t use scaler, my prediction should be all 0s or 1s, because some columns have very big numbers, the model will ignore other columns’ effect.
And thanks to sklearn and pandas, Keras don’t need to do those preprocessing at all, which Tensorflow is doing these staff inside the model. I think it’s better to do that before we feed the data, so it will be much clear what we are feeding to the neural network.