The above image was made with Stable Diffusion using the prompt 'a multilayer perceptron holding hands with a person colorful.'
I've been thinking lately about using AI to create AI. Multilayer perceptrons are rudimentary neural networks, among the most basic forms of artificial intelligence. So I asked ChatGPT to write a multilayer perceptron in Python. My initial prompt was 'Construct a multilayer perceptron in Python code.' This produced the following output:
from keras.models import Sequential
from keras.layers import Dense
# create the model
model = Sequential()
model.add(Dense(10, input_dim=4, activation='relu'))
# compile the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# train the model
model.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_test, y_test))
ChatGPT then explained the code:
In this example, we create a multilayer perceptron with two hidden layers, each with a rectified linear unit (ReLU) activation function, and a final output layer with a softmax activation function for classification. The Dense function is used to create fully connected layers. The input_dim parameter specifies the number of input features, and the units parameter specifies the number of neurons in the layer. The compile function sets the loss function, optimizer, and evaluation metrics for training. The fit function trains the model on the input data, with the specified number of epochs and batch size, and validates the performance on a separate validation dataset.
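To make that description concrete, here is a minimal sketch of the forward pass it describes (two ReLU hidden layers followed by a softmax output), written with only the Python standard library. The second hidden layer's size (8), the number of classes (3), the random weights, the sample input, and all helper names are assumptions for illustration; only the 4 input features and the first layer's 10 neurons come from the Keras call. It is a sketch of the idea, not of how Keras implements Dense internally, and it does no training.

```python
import math
import random

def relu(values):
    """Rectified linear unit: negative values become zero."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per neuron."""
    return [
        sum(x * w for x, w in zip(inputs, neuron_weights)) + b
        for neuron_weights, b in zip(weights, biases)
    ]

def random_layer(n_in, n_out, rng):
    """Hypothetical helper: random weights in [-1, 1], zero biases."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

rng = random.Random(0)
# Sizes beyond the first layer are assumptions: 4 -> 10 -> 8 -> 3.
hidden1 = random_layer(4, 10, rng)
hidden2 = random_layer(10, 8, rng)
output = random_layer(8, 3, rng)

def forward(features):
    """Two ReLU hidden layers followed by a softmax output layer."""
    h1 = relu(dense(features, *hidden1))
    h2 = relu(dense(h1, *hidden2))
    return softmax(dense(h2, *output))

# Made-up input with 4 features
probabilities = forward([5.1, 3.5, 1.4, 0.2])
print(probabilities)
```

The three printed numbers can be read as class probabilities, which is the kind of output the categorical_crossentropy loss in the Keras compile call is designed to score.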
I didn't want to run ChatGPT's Keras code. Keras can be finicky on Anaconda, and I didn't feel like messing around with installation. So I generated two more ChatGPT responses and got variations of the same result. Eventually, I formulated a new query: 'Construct a multilayer perceptron in Python code. Do not use Keras.' Here's what came back:
import numpy as np

# Define the sigmoid activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Define the derivative of the sigmoid activation function
def sigmoid_derivative(x):
    return x * (1 - x)

# Define the multilayer perceptron class
class MLP:
    def __init__(self, input_size, hidden_size, output_size):
        # Initialize the weights with random values between -1 and 1
        self.weights1 = np.random.uniform(low=-1, high=1, size=(input_size, hidden_size))
        self.weights2 = np.random.uniform(low=-1, high=1, size=(hidden_size, output_size))

    def forward(self, x):
        # Forward pass through the network
        self.hidden = sigmoid(np.dot(x, self.weights1))
        self.output = sigmoid(np.dot(self.hidden, self.weights2))
        return self.output

    def backward(self, x, y, output):
        # Backward pass through the network
        self.output_error = y - output
        self.output_delta = self.output_error * sigmoid_derivative(output)
        self.hidden_error = self.output_delta.dot(self.weights2.T)
        self.hidden_delta = self.hidden_error * sigmoid_derivative(self.hidden)
        self.weights1 += x.T.dot(self.hidden_delta)
        self.weights2 += self.hidden.T.dot(self.output_delta)

    def train(self, x, y, epochs):
        # Train the network using the specified number of epochs
        for i in range(epochs):
            output = self.forward(x)
            self.backward(x, y, output)

    def predict(self, x):
        # Predict the output for a given input
        return self.forward(x)

# Create a multilayer perceptron with 2 hidden layers
mlp = MLP(input_size=10, hidden_size=64, output_size=1)
# Generate some example data for training and testing
x_train = np.random.rand(1000, 10)
y_train = np.random.randint(0, 2, size=(1000, 1))
x_test = np.random.rand(100, 10)
y_test = np.random.randint(0, 2, size=(100, 1))
# Train the model on the example data for 10 epochs
mlp.train(x_train, y_train, epochs=10)
# Predict the output for the test data
predictions = mlp.predict(x_test)
# Calculate the accuracy of the predictions
accuracy = np.mean((predictions > 0.5) == y_test)
print("Test accuracy:", accuracy)
ChatGPT then explained the code:
In this example, we define a sigmoid activation function and its derivative, which we will use in our multilayer perceptron class. We define the MLP class, which takes the input size, hidden size, and output size as arguments. In the constructor, we initialize the weights with random values between -1 and 1. The forward method performs a forward pass through the network, computing the hidden and output layers using the sigmoid activation function. The backward method performs a backward pass through the network, computing the output error and delta, as well as the hidden error and delta. The train method trains the network using the specified number of epochs, performing forward and backward passes for each input/output pair in the training data. The predict method predicts the output for a given input using the trained weights. We create a multilayer perceptron with 2 hidden layers, and generate some example data for training and testing. We train the model on the example.
This code successfully ran in a Jupyter notebook and produced the following output: 'Test accuracy: 0.61'
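One detail in the generated script is easy to miss: sigmoid_derivative returns x * (1 - x), which is the derivative of the sigmoid only when x is already the sigmoid's output. That is how the backward method calls it. The small check below compares the formula against a finite-difference estimate. It is a sketch rather than part of ChatGPT's output: the two functions are rewritten with the standard library's math module, and the test point 0.7 and the step size are arbitrary assumptions.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def sigmoid_derivative(activation):
    # Expects the sigmoid's output, not its raw input
    return activation * (1 - activation)

z = 0.7      # arbitrary pre-activation value, chosen for the demo
step = 1e-6  # finite-difference step size, also an assumption

# Central-difference estimate of the slope of sigmoid at z
numeric = (sigmoid(z + step) - sigmoid(z - step)) / (2 * step)

# Correct usage: pass the activation sigmoid(z)
analytic = sigmoid_derivative(sigmoid(z))

# Incorrect usage: pass the raw input z
misused = sigmoid_derivative(z)

print("matches when given the activation:", abs(numeric - analytic) < 1e-8)
print("matches when given the raw input: ", abs(numeric - misused) < 1e-8)
```

The first comparison agrees and the second does not, so the helper is correct as the script uses it but would silently give wrong gradients if called on pre-activation values.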
My intention with this experiment was to make a toy to see if it could be done, and how easily. I discovered that using AI to create neural networks is dirt simple. I could have done it even if I had no knowledge of code. That has big implications.
The code I ended up with seemed properly formed and well commented. I haven't gone over it with a fine-tooth comb and there could be problems that aren't immediately apparent, but it definitely proves the concept I was exploring. We are now at the point where AI can write AI, though it isn't clear if AI can write better AI than humans yet.
This is philosophically interesting, but I wonder about practical applications. As a data science hobbyist, I can envision situations where ChatGPT could shave a few hours off a project by generating the first draft of a script. But that script would still have to be critically evaluated, which also takes time. The most interesting possibility is that the program might produce a result that changes how I think about a problem. In this way, I see the new tech as a complement to creativity.
There are indications that the next generation of these programs is exhibiting early forms of artificial general intelligence. While these emerging models won't solve the hard problem of consciousness, their capabilities are likely to be stunning. I'm very interested in what'll happen when these intelligences start interacting with the internet directly. Imagine the insights that might come from asking AI to evaluate media or public policy.