tensorflow
Installation problem
If, even after installing tensorflow, you get the error No module named 'tensorflow', then it is because you haven't run the following command:
For python 3.x: pip3 install --upgrade tensorflow
Please note that at present (6-Jan-2018) tensorflow does not support Python 3.6, so it is advised to install 3.5.x to run your programs.
For python 2.7: pip install --upgrade tensorflow
The above command will install/upgrade the tensorflow wheel.
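To verify the installation, a quick sanity check (not part of the original instructions) is to import the package and print its version:
import tensorflow as tf
print(tf.__version__)  # prints the installed version, e.g. 1.4.0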
Variables
In tensorflow, the model parameters are defined as variables (as mentioned below). Note that these can be used and even modified by the computation. We have to pass the initial value to the tf.Variable constructor; an example is shown below.
import tensorflow as tf

W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
Important note
Before Variables can be used within a session, they must be initialized using that session. This step takes the initial values (in this case tensors full of zeros) that have already been specified, and assigns them to each Variable. This can be done for all Variables at once:
For tensorflow 1.x:
sess.run(tf.global_variables_initializer())
For tensorflow 0.x:
sess.run(tf.initialize_all_variables())
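Putting the two notes together, a minimal sketch (assuming tensorflow 1.x) of defining, initializing, and reading variables inside a session:
import tensorflow as tf

W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))

sess = tf.Session()
# Assign the initial values (tensors full of zeros) to all Variables at once.
sess.run(tf.global_variables_initializer())
print(sess.run(b))  # prints an array of 10 zeros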
Placeholders
In tensorflow, a placeholder is a value which we will input when we execute the code. The difference between tf.Variable and tf.placeholder is that you have to provide an initial value for tf.Variable, which is mostly used for model parameters such as weights or biases, whereas tf.placeholder is used for the training data (ref).
A sample code (from: tensorflow.org):
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
Here, shape defines the shape of the tensor: the 1st argument (of shape) gives the number of rows and the 2nd the number of columns. None allows a dynamic number of training examples. The shape argument to placeholder is optional, but it allows TensorFlow to automatically catch bugs stemming from inconsistent tensor shapes (ref).
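A minimal sketch of how a placeholder gets its value at execution time (the zero-filled input batch below is made up purely for illustration):
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 784])
y = tf.matmul(x, tf.zeros([784, 10]))  # some computation that uses x

with tf.Session() as sess:
    # Because the 1st dimension is None, we can feed any number of rows (here 5).
    batch = np.zeros((5, 784), dtype=np.float32)
    print(sess.run(y, feed_dict={x: batch}).shape)  # (5, 10)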
Importing Features
This is an important part because of the different data types among the features. TF provides a separate module called feature_column and the entire documentation can be accessed here.
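For example, numeric features can be declared with tf.feature_column.numeric_column; the feature names below come from the flower tutorial and are only an assumption about your data:
import tensorflow as tf

# One numeric column per key in the feature dictionary.
my_feature_columns = [
    tf.feature_column.numeric_column(key='SepalLength'),
    tf.feature_column.numeric_column(key='SepalWidth'),
    tf.feature_column.numeric_column(key='PetalLength'),
    tf.feature_column.numeric_column(key='PetalWidth'),
]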
Model selection for our neural network
To access or create a model, you use an Estimator class (ref).
Pre-defined models
Example model DNNClassifier:
classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    hidden_units=[10, 10],
    n_classes=3)
- feature_columns, as the name suggests, is the list of feature columns describing the input features.
- hidden_units takes a list. Each element of this list is the # of neurons in a layer, and the total # of layers is the length of this list.
- n_classes is the number of possible values our network can predict. So, for example, if we have a classifier for flowers (as in the case of the tensorflow tutorial ref), then there are 3 possible outcomes of the network.
- An optional parameter of this estimator is called optimizer; basic details can be found here, and a short sketch is shown below (I will update this section as I move forward with the tutorials).
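A sketch of what passing an optimizer might look like (the choice of tf.train.AdagradOptimizer and the learning rate are assumptions for illustration; Adagrad happens to be the default for this estimator):
classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    hidden_units=[10, 10],
    n_classes=3,
    optimizer=tf.train.AdagradOptimizer(learning_rate=0.1))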
Custom estimators ref
Training the model
By defining the model we have the basic structure ready. Now our task is to train the neural network. This can be thought of in terms of sklearn as follows:
Model creation (sklearn):
from sklearn import linear_model

reg_var = linear_model.LinearRegression(fit_intercept=True,
                                        normalize=True,
                                        copy_X=True,
                                        n_jobs=1)
Model training (sklearn):
reg_var.fit(X_train, Y_train)
Model creation (tensorflow):
classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    hidden_units=[10, 10],
    n_classes=3)
Model training (tensorflow):
In tensorflow, much like in sklearn, we will use the following syntax:
classifier.train(
    input_fn=lambda: train_input_fn(train_feature, train_label, args.batch_size),
    steps=args.train_steps)
The steps hyperparameter is equivalent to the number of iterations (default = 1000).
Note: As mentioned in the tensorflow Tutorials, a larger number of iterations doesn't guarantee a better model.
The input_fn parameter identifies the function that will provide the training data (including the batch size). Our training input function takes three arguments: 1st the features, 2nd the labels, and 3rd the batch size. Important to note here is the data type in which tensorflow takes its input.
train_feature is a python dictionary where each key is the name of a feature and the corresponding value is an array containing the values of that feature for each example in the training set (see the small example after these descriptions). You don't have to worry about this conversion; it is explained below (hint: it uses tf.data.Dataset).
train_label is an array containing the value of the label for each example.
args.batch_size is the value of the batch size, i.e. the number of examples used in one iteration. The smaller the batch size, the faster the training, but with reduced accuracy.
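To make the expected format concrete, here is a small made-up example of what train_feature and train_label could look like for the flower classifier (all values invented):
# Two features, three training examples (illustrative values only).
train_feature = {
    'SepalLength': [5.1, 4.9, 6.3],
    'SepalWidth': [3.5, 3.0, 3.3],
}
train_label = [0, 0, 2]  # one label per example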
train_input_fn in the example above is defined as follows:
def train_input_fn(features, labels, batch_size):
    # The following call converts the input features and labels into a `tf.data.Dataset`.
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    # Batch the examples so that `batch_size` is actually used.
    dataset = dataset.batch(batch_size)
    # This return statement passes a batch of examples back to the train method.
    return dataset.make_one_shot_iterator().get_next()
If you want to shuffle the data (which is recommended), you can use tf.data.Dataset.shuffle(buffer_size=1000), or in the above case dataset = dataset.shuffle(buffer_size=1000). Setting buffer_size greater than the # of examples will ensure good shuffling.
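Chaining the shuffle into the input function gives a version close to the one in the tensorflow get-started tutorial (repeat() is added so training can run past one pass over the data):
def train_input_fn(features, labels, batch_size):
    """An input function for training, with shuffling."""
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    # Shuffle, repeat, and batch the examples.
    dataset = dataset.shuffle(buffer_size=1000).repeat().batch(batch_size)
    return dataset.make_one_shot_iterator().get_next()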
Evaluating the model
In tensorflow, each Estimator provides an evaluate method. This can be called as follows:
# Evaluate the model.
eval_result = classifier.evaluate(
    input_fn=lambda: eval_input_fn(test_x, test_y, args.batch_size))
print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**eval_result))
def eval_input_fn(features, labels=None, batch_size=None):
    """An input function for evaluation or prediction."""
    features = dict(features)
    if labels is None:
        # No labels, use only features.
        inputs = features
    else:
        inputs = (features, labels)

    # Convert the inputs to a tf.data.Dataset object.
    dataset = tf.data.Dataset.from_tensor_slices(inputs)

    # Batch the examples.
    assert batch_size is not None, "batch_size must not be None"
    dataset = dataset.batch(batch_size)

    # Return the read end of the pipeline.
    return dataset.make_one_shot_iterator().get_next()
The difference between classifier.train and classifier.evaluate is the data supplied: in classifier.train we provide the training data (X_train and Y_train), and in classifier.evaluate we provide the test data (X_test and Y_test).
Prediction using the learned model
As in the case of evaluation (i.e. measuring the error our model makes), each model (which we have stored in classifier) has a predict method as well. We will reuse the input function created for evaluation here.
predictions = classifier.predict(
    input_fn=lambda: eval_input_fn(X_predict, batch_size=args.batch_size))
Our X_predict can be imported from a variety of sources and will be converted to a suitable form (dictionaries) by tf.data.Dataset.
The predict method returns a python iterable, yielding a dictionary of prediction results for each example. This dictionary contains several keys. The probabilities key holds a list of floating-point values, each representing the probability that the input example is a particular label, e.g. 'probabilities': array([ 1.19127117e-08, 3.97069454e-02, 9.60292995e-01]); here the 3rd element (index 2) is the most probable. The class_ids key holds a one-element array that identifies the most probable species, e.g. 'class_ids': array([2]).
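A minimal sketch of consuming this iterable, printing the most probable class and its probability (variable names are illustrative):
for pred_dict in predictions:
    class_id = pred_dict['class_ids'][0]
    probability = pred_dict['probabilities'][class_id]
    print('Predicted class {} with probability {:.3f}'.format(class_id, probability))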