Using AI to Create AI

in #ai, 6 months ago


The above image was made with Stable Diffusion using the prompt 'a multilayer perceptron holding hands with a person colorful.'

I've been thinking lately about the question of using AI to create AI. Multilayer perceptrons are rudimentary neural networks, among the most basic forms of artificial intelligence. So I asked ChatGPT to write a multilayer perceptron in Python. My initial prompt was 'Construct a multilayer perceptron in Python code.' This produced the following output:

from keras.models import Sequential
from keras.layers import Dense

# create the model
model = Sequential()
model.add(Dense(10, input_dim=4, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(3, activation='softmax'))

# compile the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# train the model
model.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_test, y_test))

ChatGPT then explained the code:

In this example, we create a multilayer perceptron with two hidden layers, each with a rectified linear unit (ReLU) activation function, and a final output layer with a softmax activation function for classification. The Dense function is used to create fully connected layers. The input_dim parameter specifies the number of input features, and the units parameter specifies the number of neurons in the layer. The compile function sets the loss function, optimizer, and evaluation metrics for training. The fit function trains the model on the input data, with the specified number of epochs and batch size, and validates the performance on a separate validation dataset.
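To make the layer arithmetic in that explanation concrete, here is a minimal NumPy sketch of the forward pass such a model computes: 4 inputs, hidden layers of 10 and 8 units with ReLU, and a 3-unit softmax output. The weights are random placeholders rather than trained values, and the helper names (`relu`, `softmax`, `forward`) are my own for illustration, not part of Keras.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes taken from the Keras snippet: 4 -> 10 -> 8 -> 3
sizes = [4, 10, 8, 3]

# Random placeholder weights and zero biases (a trained model would learn these)
weights = [rng.normal(scale=0.1, size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the row max for numerical stability
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def forward(x):
    # Hidden layers use ReLU; the output layer uses softmax
    h = x
    for w, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ w + b)
    return softmax(h @ weights[-1] + biases[-1])

x = rng.random((5, 4))   # five made-up samples with four features each
probs = forward(x)
n_params = sum(w.size + b.size for w, b in zip(weights, biases))

print(probs.shape)       # one row of class probabilities per sample
print(n_params)          # weights plus biases across the three layers
```

Each row of `probs` sums to 1, and the parameter count works out to (4×10+10) + (10×8+8) + (8×3+3) = 165, which should match what Keras reports for the same architecture with its default bias terms.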

After seeing this code, I didn't want to run it. It relied on Keras, which can be finicky on Anaconda, and I didn't feel like messing around with installation. So I generated two more ChatGPT responses and got variations of the same result. Eventually, I formulated a new query: 'Construct a multilayer perceptron in Python code. Do not use Keras.' Here's what came back:

import numpy as np

# Define the sigmoid activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Define the derivative of the sigmoid activation function
def sigmoid_derivative(x):
    return x * (1 - x)

# Define the multilayer perceptron class
class MLP:
    def __init__(self, input_size, hidden_size, output_size):
        # Initialize the weights with random values between -1 and 1
        self.weights1 = np.random.uniform(low=-1, high=1, size=(input_size, hidden_size))
        self.weights2 = np.random.uniform(low=-1, high=1, size=(hidden_size, output_size))

    def forward(self, x):
        # Forward pass through the network
        self.hidden = sigmoid(np.dot(x, self.weights1))
        self.output = sigmoid(np.dot(self.hidden, self.weights2))
        return self.output

    def backward(self, x, y, output):
        # Backward pass through the network
        self.output_error = y - output
        self.output_delta = self.output_error * sigmoid_derivative(output)
        self.hidden_error = np.dot(self.output_delta, self.weights2.T)
        self.hidden_delta = self.hidden_error * sigmoid_derivative(self.hidden)
        self.weights1 += np.dot(x.T, self.hidden_delta)
        self.weights2 += np.dot(self.hidden.T, self.output_delta)

    def train(self, x, y, epochs):
        # Train the network using the specified number of epochs
        for i in range(epochs):
            output = self.forward(x)
            self.backward(x, y, output)

    def predict(self, x):
        # Predict the output for a given input
        return self.forward(x)

# Create a multilayer perceptron with 2 hidden layers
mlp = MLP(input_size=10, hidden_size=64, output_size=1)

# Generate some example data for training and testing
x_train = np.random.rand(1000, 10)
y_train = np.random.randint(0, 2, size=(1000, 1))
x_test = np.random.rand(100, 10)
y_test = np.random.randint(0, 2, size=(100, 1))

# Train the model on the example data for 10 epochs
mlp.train(x_train, y_train, epochs=10)

# Predict the output for the test data
predictions = mlp.predict(x_test)

# Calculate the accuracy of the predictions
accuracy = np.mean((predictions > 0.5) == y_test)
print("Test accuracy:", accuracy)

ChatGPT then explained the code:

In this example, we define a sigmoid activation function and its derivative, which we will use in our multilayer perceptron class. We define the MLP class, which takes the input size, hidden size, and output size as arguments. In the constructor, we initialize the weights with random values between -1 and 1. The forward method performs a forward pass through the network, computing the hidden and output layers using the sigmoid activation function. The backward method performs a backward pass through the network, computing the output error and delta, as well as the hidden error and delta. The train method trains the network using the specified number of epochs, performing forward and backward passes for each input/output pair in the training data. The predict method predicts the output for a given input using the trained weights. We create a multilayer perceptron with 2 hidden layers, and generate some example data for training and testing. We train the model on the example.
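The "error and delta" language in that explanation is ordinary backpropagation. As a rough check, the sketch below uses the usual expressions for a one-hidden-layer sigmoid network (assumed here to match what the class's backward method computes) and compares the resulting weight steps against finite-difference gradients of half the summed squared error. The tiny layer sizes, the `loss` and `numerical_grad` helpers, and the random data are my own choices for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def loss(w1, w2, x, y):
    # Half the sum of squared errors over all samples
    out = sigmoid(sigmoid(x @ w1) @ w2)
    return 0.5 * np.sum((y - out) ** 2)

rng = np.random.default_rng(0)
x = rng.random((6, 3))                                # 6 samples, 3 features
y = rng.integers(0, 2, size=(6, 1)).astype(float)     # random 0/1 targets
w1 = rng.uniform(-1, 1, size=(3, 4))
w2 = rng.uniform(-1, 1, size=(4, 1))

# Backward pass for a one-hidden-layer sigmoid network
hidden = sigmoid(x @ w1)
output = sigmoid(hidden @ w2)
output_delta = (y - output) * output * (1 - output)
hidden_delta = (output_delta @ w2.T) * hidden * (1 - hidden)
step1 = x.T @ hidden_delta        # amount added to the first weight matrix
step2 = hidden.T @ output_delta   # amount added to the second weight matrix

def numerical_grad(f, w, eps=1e-6):
    # Central finite differences, perturbing one weight at a time in place
    g = np.zeros_like(w)
    for idx in np.ndindex(*w.shape):
        old = w[idx]
        w[idx] = old + eps
        up = f()
        w[idx] = old - eps
        down = f()
        w[idx] = old
        g[idx] = (up - down) / (2 * eps)
    return g

g1 = numerical_grad(lambda: loss(w1, w2, x, y), w1)
g2 = numerical_grad(lambda: loss(w1, w2, x, y), w2)

# The steps should equal the negative gradient of the loss
print(np.allclose(step1, -g1, atol=1e-6))
print(np.allclose(step2, -g2, atol=1e-6))
```

If both checks print True, the weight updates are plain gradient descent on squared error with an implicit learning rate of 1. That is a large step when gradients are summed over 1,000 samples, so a variant that scales the steps by a small learning rate would be a natural thing to try.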

This code successfully ran in a Jupyter notebook and produced the following output: 'Test accuracy: 0.61'
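For context on that number: the example labels are drawn at random and have no relationship to the inputs, so there is nothing for the network to learn and accuracy should hover around 50%. The sketch below is a back-of-the-envelope binomial approximation (my framing, not something ChatGPT produced) of how far 0.61 sits from chance on a 100-sample test set.

```python
import math

n_test = 100        # size of y_test in the script above
p_chance = 0.5      # labels are drawn uniformly from {0, 1}, independent of the inputs
observed = 0.61     # accuracy reported above

# Standard error of an accuracy estimate when every prediction is a coin flip
std_err = math.sqrt(p_chance * (1 - p_chance) / n_test)

# How many standard errors the observed accuracy sits above chance
z = (observed - p_chance) / std_err

print(f"standard error: {std_err:.3f}")
print(f"z-score: {z:.1f}")
```

With a standard error of 0.05, a score of 0.61 is roughly two standard errors above chance, which is within reach of luck on a test set this small. Because the script sets no random seed, rerunning it should give a different number each time.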

My intention with this experiment was to make a toy to see if it could be done, and how easily. I discovered that using AI to create neural networks is dirt simple. I could have done it even if I had no knowledge of code. That has big implications.

The code I ended up with seemed properly formed and well commented. I haven't gone over it with a fine-tooth comb and there could be problems that aren't immediately apparent, but it definitely proves the concept I was exploring. We are now at the point where AI can write AI, though it isn't yet clear whether AI can write better AI than humans can.

This is philosophically interesting, but I wonder about practical applications. As a data science hobbyist, I can envision situations where ChatGPT could shave a few hours off a project by generating the first draft of a script. But that script would still have to be critically evaluated, which also takes time. The most interesting possibility is that the program might produce a result that changes how I think about a problem. In this way, I see the new tech as a complement to creativity.

There are indications that the next generation of these programs is exhibiting early forms of Artificial General Intelligence. While these emerging models won't solve the hard problem of consciousness, their capabilities are likely to be stunning. I'm very interested in what'll happen when these intelligences start interacting with the internet directly. Imagine the insights that might come from asking AI to evaluate media or public policy.

See my NFTs:

  • Small Gods of Time Travel is a 41 piece Tezos NFT collection on Objkt that goes with my book by the same name.
  • History and the Machine is a 20 piece Tezos NFT collection on Objkt based on my series of oil paintings of interesting people from history.
  • Artifacts of Mind Control is a 15 piece Tezos NFT collection on Objkt based on declassified CIA documents from the MKULTRA program.

Wow! 80% of this post was mostly foreign to me, yet I deeply enjoyed being mystified. I also love that you call yourself a data scientist hobbyist :) When it comes to AI evaluating media or public policy, what do you envision? Also, what do you think would be the greatest realistic risk of AI becoming more sophisticated? No need to respond in depth if these are too big to answer via a Hive comment!

AI can read media reports and detect biases/misleading information, with the potential to make propaganda more transparent. AI could serve policy by reading a million little environmental signals for a variety of reasons, from understanding risk in a given situation to recommending interest rate adjustments.

The greatest near-term risk of sophisticated AI in my opinion is that it's likely to be used by the powerful to further consolidate their power. Good AI could already make a UFO hoax look believable.
