Hi guys, in this blog post I will explain how we can implement simple linear regression with TensorFlow. Simple linear regression means regression with a single input and a single output.

In future posts I will explain:

1. Linear regression with multiple features.

2. Polynomial regression.

3. Regularized/Normalized linear regression.

4. Linear regression with external data.

Today I want to keep the problem simple, so I am using some inline data for the analysis.

I hope you have some good knowledge of machine learning. If not, please take a course on Coursera. Coursera has a good machine learning course by Andrew Ng.

Do we really need TensorFlow to do linear regression? We can just as well implement it in Octave, MATLAB, plain Python, scikit-learn, etc.
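To show that the tool really is interchangeable, here is a minimal sketch that fits the same straight line with plain NumPy, using the inline data this post works with. It relies on `numpy.polyfit` with degree 1, which does an ordinary least-squares fit. The variable names `slope` and `intercept` are my own choice for illustration.

```python
import numpy as np

# The same inline training data used in this post.
train_X = np.asarray([3.3, 4.4, 5.5, 6.71, 6.93, 4.168, 9.779, 6.182, 7.59, 2.167,
                      7.042, 10.791, 5.313, 7.997, 5.654, 9.27, 3.1])
train_Y = np.asarray([1.7, 2.76, 2.09, 3.19, 1.694, 1.573, 3.366, 2.596, 2.53, 1.221,
                      2.827, 3.465, 1.65, 2.904, 2.42, 2.94, 1.3])

# Fit a degree-1 polynomial (a straight line) by least squares.
# polyfit returns coefficients from the highest degree down: [slope, intercept].
slope, intercept = np.polyfit(train_X, train_Y, 1)
print("W (slope) =", slope, "b (intercept) =", intercept)
```

The values printed here are a handy reference point: the W and b that gradient descent finds in TensorFlow should end up close to them.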

Understanding the math and statistics behind linear regression is more important than the tool we are going to use.

So how will we approach this problem?

First we need to check that the relationship between input and output in our data is roughly linear. For that we need to plot it. You can use the code below to plot your data.

import matplotlib.pyplot as plt
import numpy
train_X = numpy.asarray([3.3,4.4,5.5,6.71,6.93,4.168,9.779,6.182,7.59,2.167,
                         7.042,10.791,5.313,7.997,5.654,9.27,3.1])
train_Y = numpy.asarray([1.7,2.76,2.09,3.19,1.694,1.573,3.366,2.596,2.53,1.221,
                         2.827,3.465,1.65,2.904,2.42,2.94,1.3])
plt.plot(train_X, train_Y, 'ro', label='Original data')
plt.legend()
plt.show()

From the plot it is clear that the data follows a roughly linear trend, so we can use simple linear regression on it.
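If you want a number to go with the visual check, one option (my own addition, not a required step) is the Pearson correlation coefficient, which NumPy exposes through `numpy.corrcoef`. A value close to 1 or -1 suggests a strong linear relationship. The sketch below assumes the same inline data.

```python
import numpy as np

train_X = np.asarray([3.3, 4.4, 5.5, 6.71, 6.93, 4.168, 9.779, 6.182, 7.59, 2.167,
                      7.042, 10.791, 5.313, 7.997, 5.654, 9.27, 3.1])
train_Y = np.asarray([1.7, 2.76, 2.09, 3.19, 1.694, 1.573, 3.366, 2.596, 2.53, 1.221,
                      2.827, 3.465, 1.65, 2.904, 2.42, 2.94, 1.3])

# corrcoef returns a 2x2 correlation matrix; the off-diagonal entry
# is the correlation between train_X and train_Y.
r = np.corrcoef(train_X, train_Y)[0, 1]
print("Pearson correlation:", r)
```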

Next we are going to define our hypothesis, cost function and optimizer. I hope the reader is aware of what cost means, how to minimize it, etc.

For linear regression the hypothesis we are going to use is the equation of a straight line.

hypothesis (prediction) = WX + b (where W is the slope, b is the y-intercept and X is our input).

In TensorFlow we call them the weight (W) and the bias (b).

Next, what is cost? Cost measures the difference between the actual and the predicted values. We need to minimize this cost to find the best W and b.

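The cost function for linear regression is the mean squared error: cost = (1/n) * sum over i of (W*x_i + b - y_i)^2, where n is the number of training samples.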
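To make the cost concrete, here is a small NumPy sketch of the hypothesis and the mean squared error, matching the `tf.reduce_mean(tf.square(hypothesis - Y))` expression used in the full code. The helper names `predict` and `mse_cost` and the toy data are hypothetical, chosen only for illustration.

```python
import numpy as np

def predict(X, W, b):
    # Hypothesis: a straight line with slope W and intercept b.
    return W * X + b

def mse_cost(X, Y, W, b):
    # Mean of the squared differences between predicted and actual values.
    return np.mean((predict(X, W, b) - Y) ** 2)

# Toy data lying exactly on the line Y = 2X (made up for this example).
X = np.asarray([1.0, 2.0, 3.0])
Y = np.asarray([2.0, 4.0, 6.0])

perfect = mse_cost(X, Y, W=2.0, b=0.0)  # the line matches the data exactly
worse = mse_cost(X, Y, W=1.0, b=0.0)    # errors are -1, -2, -3
print(perfect, worse)
```

With W=2 the cost is zero, and with W=1 the squared errors are 1, 4 and 9, so the cost is 14/3. The further the line is from the data, the larger the cost.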

To minimize the cost we are going to use the gradient descent algorithm.
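If you want to see what gradient descent does under the hood, here is a rough NumPy sketch of the update rule for the mean squared error cost. It is not what TensorFlow runs internally line by line, just the same idea written by hand. I assume W and b start at zero (the TensorFlow code starts from random values), and the names `grad_W`, `grad_b` and `costs` are my own.

```python
import numpy as np

train_X = np.asarray([3.3, 4.4, 5.5, 6.71, 6.93, 4.168, 9.779, 6.182, 7.59, 2.167,
                      7.042, 10.791, 5.313, 7.997, 5.654, 9.27, 3.1])
train_Y = np.asarray([1.7, 2.76, 2.09, 3.19, 1.694, 1.573, 3.366, 2.596, 2.53, 1.221,
                      2.827, 3.465, 1.65, 2.904, 2.42, 2.94, 1.3])

learning_rate = 0.01
training_epochs = 1000

# Assumption: start from zero instead of random values, for reproducibility.
W, b = 0.0, 0.0
costs = []

for epoch in range(training_epochs):
    error = W * train_X + b - train_Y      # prediction minus actual
    costs.append(np.mean(error ** 2))      # cost before this update

    # Partial derivatives of the mean squared error with respect to W and b.
    grad_W = 2.0 * np.mean(error * train_X)
    grad_b = 2.0 * np.mean(error)

    # Step against the gradient, scaled by the learning rate.
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

final_cost = np.mean((W * train_X + b - train_Y) ** 2)
print("cost=", final_cost, "W=", W, "b=", b)
```

Each pass moves W and b a small step in the direction that reduces the cost. With this small learning rate the bias converges slowly, so after 1000 epochs it may still be a little away from the least-squares value.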

The full code is given below.

from __future__ import print_function
# Note: this code uses the TensorFlow 1.x graph API (tf.placeholder, tf.Session).
# On TensorFlow 2.x, replace the import below with
# `import tensorflow.compat.v1 as tf` followed by `tf.disable_v2_behavior()`.
import tensorflow as tf
import numpy
import matplotlib.pyplot as plt
rng = numpy.random
# Training parameters
learning_rate = 0.01
training_epochs = 1000
display_step = 50
# Training data
train_X = numpy.asarray([3.3,4.4,5.5,6.71,6.93,4.168,9.779,6.182,7.59,2.167,
                         7.042,10.791,5.313,7.997,5.654,9.27,3.1])
train_Y = numpy.asarray([1.7,2.76,2.09,3.19,1.694,1.573,3.366,2.596,2.53,1.221,
                         2.827,3.465,1.65,2.904,2.42,2.94,1.3])
n_samples = train_X.shape[0]
# Graph inputs
X = tf.placeholder("float")
Y = tf.placeholder("float")
# Model parameters, initialized randomly
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")
# Hypothesis: W*X + b
hypothesis = tf.add(tf.multiply(X, W), b)
# Cost: mean squared error
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Gradient descent optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    # Run one gradient descent step per epoch on the full training set
    for epoch in range(training_epochs):
        sess.run(optimizer, feed_dict={X: train_X, Y: train_Y})
        # Print the progress every display_step epochs
        if (epoch+1) % display_step == 0:
            c = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c),
                  "W=", sess.run(W), "b=", sess.run(b))
    print("Optimization Finished!")
    training_cost = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
    print("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')
    # Graphic display
    plt.plot(train_X, train_Y, 'ro', label='Original data')
    plt.plot(train_X, sess.run(W) * train_X + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()
    # Testing example, as requested (Issue #2)
    test_X = numpy.asarray([6.83, 4.668, 8.9, 7.91, 5.7, 8.7, 3.1, 2.1])
    test_Y = numpy.asarray([1.84, 2.273, 3.2, 2.831, 2.92, 3.24, 1.35, 1.03])
    print("Testing... (Mean square loss Comparison)")
    testing_cost = sess.run(
        cost,
        feed_dict={X: test_X, Y: test_Y})  # same function as cost above
    print("Testing cost=", testing_cost)
    print("Absolute mean square loss difference:", abs(
        training_cost - testing_cost))
    # Plot the test data against the line fitted on the training data
    plt.plot(test_X, test_Y, 'bo', label='Testing data')
    plt.plot(train_X, sess.run(W) * train_X + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()