1.41-Biological inspiration of Artificial Intelligence

Objectives: 1.41-Biological inspiration of Artificial Intelligence

Biological Inspiration of Artificial Intelligence — Full Notes (English with Swahili)


Deep, practical notes with formulas, symbol meanings, real-world examples, and inline drawings. (Maelezo ya kina pamoja na fomula, maana ya alama, mifano halisi, na michoro.)

Language note: Key technical terms include Swahili glosses in italics next to the English term.

1) Foundations: Neurons, Synapses, and Learning

The core biological unit is the neuron (neuroni). In AI, a neuron computes a weighted sum and passes it through a nonlinearity. A synapse (sinapsi) carries a weight that can strengthen/weaken with experience.

Perceptron-style neuron

Computation: $$z=\sum_{i=1}^n w_i x_i + b,\quad a=\phi(z)$$

  • x_i: input feature (kipengele cha ingizo)
  • w_i: synaptic weight (uzito wa sinapsi)
  • b: bias (mgeuko wa msingi)
  • \phi: activation function (kazi ya uanzishaji) (ReLU, sigmoid, tanh)
[Diagram: inputs x1, x2, x3 weighted by w1, w2, w3 feed a summation Σ + b, then pass through activation φ to give output a.]
Real-world analogy: Weights are like adjustable volume knobs on different microphones feeding a mixer. The activation is the decision of whether the mixed sound is loud enough to trigger an action. (Mfano halisi: Uzito ni kama vifungo vya sauti kwenye maikrofoni tofauti zinazoenda kwenye mchanganyiko; uanzishaji ni uamuzi kama sauti imefikia kiwango cha kuchochea kitendo.)
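A minimal sketch of this neuron in Python with NumPy; the input values, weights, bias, and the choice of ReLU as φ are illustrative, not fixed by the notes above.

# A single artificial neuron: weighted sum plus bias, then a nonlinearity (uanzishaji).
import numpy as np

def relu(z):
    return np.maximum(0.0, z)          # ReLU activation

def neuron(x, w, b, phi=relu):
    z = np.dot(w, x) + b               # z = sum_i w_i x_i + b
    return phi(z)                      # a = phi(z)

# Example: three inputs with illustrative weights and bias
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.8, 0.1, -0.4])
print(neuron(x, w, b=0.2))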

2) Biologically Inspired Learning Rules

Hebbian Learning (kujifunza kwa Hebbian)

“Cells that fire together wire together.” Update: $$\Delta w_{ij} = \eta\, x_i y_j$$ where $x_i$ is presynaptic activity and $y_j$ postsynaptic activity; $\eta$ is learning rate.

  • \eta: learning rate (kiwango cha ujifunzaji)
  • x_i, y_j: activities (shughuli za neva)
[Diagram: Hebbian update Δw ∝ x·y on a synapse w_ij between presynaptic activity x_i and postsynaptic activity y_j.]
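A short Python sketch of the plain Hebbian update; the activity vectors and learning rate are illustrative. Compare with the Oja pseudocode in Section 10, which adds the normalizing term.

# Plain Hebbian update: Δw_ij = η * x_i * y_j (an outer product of activities).
import numpy as np

eta = 0.01                               # learning rate (kiwango cha ujifunzaji)
x = np.array([1.0, 0.0, 0.5])            # presynaptic activity (shughuli kabla ya sinapsi)
y = np.array([0.2, 0.9])                 # postsynaptic activity (shughuli baada ya sinapsi)
W = np.zeros((3, 2))                     # weights w_ij from input i to output j

W += eta * np.outer(x, y)                # "fire together, wire together"
print(W)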

Oja’s Rule (kanuni ya Oja)

Normalizes Hebbian growth to prevent divergence: $$\Delta w = \eta\,( y\,x - y^2 w )$$ Leads to principal component extraction.

BCM Rule (kanuni ya BCM)

Sliding threshold $\theta(y)$ controls LTP/LTD: $$\Delta w = \eta\, y\,(y-\theta)\,x.$$ If postsynaptic activity $y$ is below the threshold $\theta$, the synapse is depressed (LTD); if above, it is potentiated (LTP).
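A minimal sketch of a BCM-style update, assuming the sliding threshold θ is tracked as a running average of y² (a common choice; the notes above leave θ(y) unspecified, and all constants here are illustrative).

# BCM update with a sliding threshold theta tracked as a running average of y^2.
import numpy as np

eta, tau_theta = 0.01, 100.0
w = np.random.randn(4) * 0.1
theta = 0.0                                    # sliding threshold (kizingiti kinachoteleza)

for x in np.random.rand(1000, 4):              # illustrative input stream
    y = w @ x                                  # postsynaptic activity
    w += eta * y * (y - theta) * x             # LTD if y < theta, LTP if y > theta
    theta += (y**2 - theta) / tau_theta        # slide threshold toward the mean of y^2
print(w, theta)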

STDP (uenezaji wa muda—Spike-Timing-Dependent Plasticity)

Weight change depends on spike timing $\Delta t = t_{post}-t_{pre}$: $$\Delta w = \begin{cases} A_+\,e^{-\Delta t/\tau_+}, & \Delta t>0\\ -A_-\,e^{\Delta t/\tau_-}, & \Delta t<0 \end{cases}$$

  • A_+, A_-: max potentiation/depression
  • \tau_+, \tau_-: time constants (muda bainishi)
[Plot: Δw as a function of Δt, positive (potentiation) for Δt > 0 and negative (depression) for Δt < 0.]
Real-world analogy: STDP is like teaching a dance duo; if the lead’s cue comes just before the partner’s move, the connection strengthens. If the cue lags, the partnership weakens. (Mfano: Kama wanadansi—ishara sahihi kabla ya mwitikio huimarisha uhusiano; ukichelewa, hudhoofisha.)

3) Spiking Neurons & Neural Coding

Leaky Integrate-and-Fire (LIF) (mtindo wa kuvuja na kuunganishwa)

Membrane potential $V$ obeys $$\tau_m \frac{dV}{dt} = - (V - V_{rest}) + R I(t).$$ When $V$ crosses $V_{th}$, neuron emits a spike and $V\to V_{reset}$.

  • \tau_m: membrane time constant (muda wa utando)
  • R: membrane resistance (upinzani)
  • I(t): input current (mkondo wa ingizo)
[Plot: membrane potential V against time t, rising toward threshold, spiking, then resetting.]
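A minimal Euler simulation of the LIF equation; the constants (τ_m, R, V_rest, V_th, V_reset, and the input current) are illustrative values, not taken from the notes.

# Euler simulation of a leaky integrate-and-fire neuron driven by a constant current.
import numpy as np

tau_m, R = 20.0, 1.0                           # membrane time constant (ms), resistance
V_rest, V_th, V_reset = -65.0, -50.0, -70.0    # potentials in mV (illustrative)
dt, T, I = 0.1, 200.0, 20.0                    # time step, duration (ms), input current

V, spikes = V_rest, []
for step in range(int(T / dt)):
    V += dt / tau_m * (-(V - V_rest) + R * I)  # tau_m dV/dt = -(V - V_rest) + R I
    if V >= V_th:                              # threshold crossing -> spike
        spikes.append(step * dt)
        V = V_reset                            # reset (rudisha chini)
print(f"{len(spikes)} spikes in {T} ms")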

Neural Codes

  • Rate code (msongamano wa misukumo): information in average firing rate.
  • Temporal code (msimbo wa muda): information in precise spike times.
  • Population code (msimbo wa kundi): information across groups of neurons.

Design hint: Spiking NNs can be energy-efficient on neuromorphic chips and are well suited to event-based sensors such as dynamic vision sensors (DVS). (Vidokezo: Mitandao ya kusukuma misukumo ni yenye ufanisi nishati kwenye vifaa neuromorphic.)
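A small sketch of a rate code, assuming a Poisson spike generator: the stimulus intensity sets the mean firing rate, so information sits in the spike count rather than in exact spike times.

# Rate coding: encode stimulus intensity as the mean rate of a Poisson spike train.
import numpy as np

rng = np.random.default_rng(0)

def poisson_spikes(rate_hz, duration_s=1.0, dt=0.001):
    # One spike per time bin with probability rate*dt (valid for small dt)
    return rng.random(int(duration_s / dt)) < rate_hz * dt

weak, strong = poisson_spikes(5.0), poisson_spikes(50.0)
print(weak.sum(), "vs", strong.sum(), "spikes in one second")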

4) From Brains to Architectures

CNNs & Visual Cortex (gamba la kuona)

Convolution models local receptive fields (maeneo yanayopokelea) and the simple→complex cell hierarchy of the visual cortex.

Feature map: $$h_{k}(u,v)=\sigma\!\Big(\sum_{c} (w_{k,c} * x_c)(u,v) + b_k\Big)$$
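A minimal sketch of a single feature map: one input channel, one 3×3 kernel, "valid" cross-correlation, then a ReLU for σ. The kernel values and input are illustrative.

# One convolutional feature map: slide a 3x3 kernel over the input, add a bias, apply ReLU.
import numpy as np

def feature_map(x, k, b):
    h_in, w_in = x.shape
    kh, kw = k.shape
    out = np.zeros((h_in - kh + 1, w_in - kw + 1))
    for u in range(out.shape[0]):
        for v in range(out.shape[1]):
            out[u, v] = np.sum(k * x[u:u+kh, v:v+kw]) + b   # local receptive field
    return np.maximum(out, 0.0)                             # sigma = ReLU

x = np.random.rand(8, 8)                                          # illustrative input
k = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]], dtype=float)   # vertical-edge kernel
print(feature_map(x, k, b=0.0).shape)                             # (6, 6)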

RNNs & Recurrence

Recurrent networks echo cortical feedback loops and working memory. $$h_t=\phi(W_{hh}h_{t-1}+W_{xh}x_t+b_h).$$ LSTM/GRU add gates analogous to inhibitory control.

Attention & Transformers

Inspired by selective attention (umakini teule). $$\text{Attn}(Q,K,V)=\text{softmax}\!\left(\frac{QK^\top}{\sqrt{d_k}}\right)V$$
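A minimal NumPy sketch of scaled dot-product attention (single head, no masking; the matrix shapes are illustrative).

# Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
import numpy as np

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # similarity of queries to keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                               # weighted mix of values

Q = np.random.rand(4, 8); K = np.random.rand(6, 8); V = np.random.rand(6, 8)
print(attention(Q, K, V).shape)                      # (4, 8)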

[Diagram: an input image convolved with a 3×3 kernel to produce a feature map.]

Analogy: CNNs scan with a small stencil like a chef tasting tiny samples across a large soup pot. (Mfano: CNNs ni kama kijiti cha kuonja supu sehemu kwa sehemu.)

5) Reinforcement Learning & Dopamine Reward Prediction Error

Phasic dopamine resembles the reward prediction error (RPE) (kosa la utabiri wa tuzo). In TD learning: $$\delta_t = r_t + \gamma V(s_{t+1}) - V(s_t),\quad V\leftarrow V + \alpha\,\delta_t \nabla V.$$

  • \gamma: discount factor (kipunguzio cha wakati)
  • \alpha: step size (kiwango cha hatua)
  • V(s): value function (kazi ya thamani)
[Diagram: timeline showing a predictive cue followed later by the reward.]

Reality check: Animals shift dopamine bursts from unexpected reward to earlier predictive cues—mirroring TD learning. (Wanyama huhamisha msukumo wa dopamini kutoka zawadi isiyotarajiwa hadi ishara ya mapema.)

6) Predictive Coding & Energy-Based Views

Brains may minimize prediction error (kupunguza kosa la utabiri). Layer $l$ maintains representation $z^{(l)}$ and predicts the layer below; errors flow upward.

Objective (schematic): $$\mathcal{L}=\sum_l \|\epsilon^{(l)}\|^2,\quad \epsilon^{(l)} = z^{(l-1)} - f_l(z^{(l)}).$$

Energy-based models define energy $E(x,y)$ and learn to assign lower energy to correct pairs. Contrastive learning approximates gradients by pushing positive pairs together and negatives apart.
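A small sketch of one predictive-coding step for a two-layer stack, assuming a linear generative map f_l(z) = W z: the error ε is computed, and the upper representation z^(l) is nudged downhill on ‖ε‖². The matrices and step size are illustrative.

# One predictive-coding step: the upper layer predicts the lower layer; the prediction
# error flows upward and nudges the upper representation to reduce the squared error.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 4)) * 0.1     # illustrative generative map f_l(z) = W z
z0 = rng.standard_normal(16)               # lower-layer representation z^(l-1)
z1 = rng.standard_normal(4)                # upper-layer representation z^(l)

for _ in range(50):
    eps = z0 - W @ z1                      # prediction error eps^(l)
    z1 += 0.1 * W.T @ eps                  # gradient step on L = ||eps||^2 w.r.t. z^(l)
print(np.linalg.norm(z0 - W @ z1))         # error shrinks as z1 comes to explain z0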

Analogy: Like a weather app constantly correcting its forecast using new sensor readings. (Mfano: kama programu ya hali ya hewa inavyorekebisha utabiri wake.)

7) Reservoir & Neuromorphic Computing

Echo State / Liquid State Machines

Keep a large, fixed recurrent pool (hifadhi) with rich dynamics; train only a readout.

Reservoir state $$r_t=\phi(W_r r_{t-1}+W_{in} x_t)$$ and output $$y_t=W_{out} r_t.$$
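A minimal echo-state sketch in Python: a fixed random reservoir, states collected over an input sequence, and only W_out fitted by least squares. The sizes, weight scaling, and next-step-prediction task are illustrative choices.

# Echo state network: fixed random reservoir, train only the linear readout W_out.
import numpy as np

rng = np.random.default_rng(0)
n_res, n_in = 100, 1
W_r  = rng.standard_normal((n_res, n_res)) * 0.05     # fixed recurrent weights (hifadhi)
W_in = rng.standard_normal((n_res, n_in))             # fixed input weights

x = np.sin(np.linspace(0, 20, 500)).reshape(-1, 1)    # illustrative input signal
target = np.roll(x, -1, axis=0)                       # task: predict the next value

r = np.zeros(n_res)
states = []
for x_t in x:
    r = np.tanh(W_r @ r + W_in @ x_t)                 # r_t = phi(W_r r_{t-1} + W_in x_t)
    states.append(r.copy())
states = np.array(states)

W_out, *_ = np.linalg.lstsq(states, target, rcond=None)   # fit the readout only
print(np.mean((states @ W_out - target) ** 2))            # training error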

Neuromorphic Hardware

  • Event-driven spikes (asynchronous) conserve energy.
  • Local learning (e.g., STDP) reduces memory traffic.
  • Co-location of memory & compute (pembejeo na hesabu pamoja).
[Diagram: a fixed random recurrent reservoir core; only the readout layer is trained.]

8) Evolutionary & Swarm Intelligence

Genetic Algorithms (GA) (algoriti za kijeni)

Population-based search using selection, crossover, mutation.

Fitness-proportionate selection: $$p_i = \frac{f_i}{\sum_j f_j}.$$
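A short sketch of fitness-proportionate (roulette-wheel) selection, assuming non-negative fitness values; the population here is illustrative.

# Roulette-wheel selection: pick parents with probability p_i = f_i / sum_j f_j.
import numpy as np

rng = np.random.default_rng(0)
fitness = np.array([1.0, 4.0, 2.0, 8.0])          # illustrative non-negative fitnesses
p = fitness / fitness.sum()                       # selection probabilities
parents = rng.choice(len(fitness), size=10, p=p)  # fitter individuals chosen more often
print(np.bincount(parents, minlength=len(fitness)))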

Particle Swarm Optimization (PSO) (mzinga wa chembe)

Velocity update: $$v_i \leftarrow \omega v_i + c_1 r_1 (p_i - x_i) + c_2 r_2 (g - x_i)$$ then position $x_i\leftarrow x_i+v_i$.
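A minimal PSO sketch minimizing an assumed test objective (the sphere function ‖x‖²); ω, c1, c2, and the swarm size are illustrative.

# Particle swarm optimization on f(x) = ||x||^2 (illustrative objective).
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sum(x**2, axis=1)                 # objective (lengo)
omega, c1, c2 = 0.7, 1.5, 1.5
x = rng.uniform(-5, 5, (30, 2)); v = np.zeros_like(x)
p, p_val = x.copy(), f(x)                          # personal bests
g = p[np.argmin(p_val)]                            # global best

for _ in range(100):
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    v = omega * v + c1 * r1 * (p - x) + c2 * r2 * (g - x)   # velocity update
    x = x + v                                               # position update
    better = f(x) < p_val
    p[better], p_val[better] = x[better], f(x)[better]
    g = p[np.argmin(p_val)]
print(g)                                           # ends up near the optimum at the origin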

Ant Colony Optimization (ACO) (mchakato wa siafu)

Pheromone update (schematic): $$\tau_{ij} \leftarrow (1-\rho)\,\tau_{ij} + \sum_{k}\Delta \tau^k_{ij}.$$
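A tiny sketch of the pheromone update alone (evaporation plus deposits), assuming each ant k deposits Δτ = 1/L_k on the edges of its tour; the tours and lengths are illustrative.

# ACO pheromone update: evaporate, then let each ant deposit 1/L_k on the edges it used.
import numpy as np

n, rho = 4, 0.1                                   # cities, evaporation rate
tau = np.ones((n, n))                             # pheromone matrix tau_ij
tours = [([0, 1, 2, 3, 0], 10.0),                 # illustrative (tour, length) pairs
         ([0, 2, 1, 3, 0], 14.0)]

tau *= (1 - rho)                                  # evaporation
for tour, length in tours:
    for i, j in zip(tour[:-1], tour[1:]):
        tau[i, j] += 1.0 / length                 # deposit: Δτ^k_ij = 1 / L_k
print(tau)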

Analogy: Many simple agents explore and share hints (pheromones/best positions), collectively finding good solutions. (Wakala wengi wadogo hushirikiana kutafuta suluhisho.)

9) Symbols, Notation & Quick Reference

  • x, x_i: Input / presynaptic activity (ingizo / shughuli ya kabla ya sinapsi)
  • w, w_i: Synaptic weight / parameter (uzito wa sinapsi / kigezo)
  • b: Bias term (mgeuko wa msingi)
  • a, y: Neuron activation / output (uanzishaji / pato)
  • \eta: Learning rate (kiwango cha ujifunzaji)
  • \alpha, \beta: Step-size / decay constants (kiwango cha hatua / mgawo wa upungufu)
  • \gamma: Discount factor (kipunguzio cha wakati)
  • \phi, \sigma: Activation / nonlinearity (kazi ya uanzishaji)
  • \tau: Time constant (muda bainishi)
  • \delta: Prediction error (kosa la utabiri)
Tip: When reading formulas, map each symbol to a physical intuition (flow, resistance, inertia). This builds problem-solving “gut feel.” (Fikiria alama kama vitu halisi—mtiririko, upinzani, uzito.)

10) Worked Examples & Mini Labs

# Oja's rule (kanuni ya Oja) in Python/NumPy (English with Swahili comments)
import numpy as np

def oja_first_pc(X, eta=0.01, epochs=20):
    w = np.random.randn(X.shape[1])
    w /= np.linalg.norm(w)                 # initialize w as a random unit vector
    for _ in range(epochs):
        for x in X:
            y = w @ x                      # activation (uanzishaji)
            w += eta * (y * x - y**2 * w)  # Oja's update (kanuni ya Oja)
            w /= np.linalg.norm(w)         # renormalize (weka sawa)
    return w                               # approximates the first principal component (sehemu kuu ya kwanza)

Mini lab: plot $\Delta w$ against $\Delta t$ for the STDP window with $A_+=A_-=1$ and $\tau_+=\tau_-=20$ ms.
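A sketch of that plot with Matplotlib, using the parameter values stated above.

# STDP window: Δw = A+ * exp(-Δt/τ+) for Δt > 0, and -A- * exp(Δt/τ-) for Δt < 0.
import numpy as np
import matplotlib.pyplot as plt

A_plus = A_minus = 1.0
tau_plus = tau_minus = 20.0                           # ms
dt = np.linspace(-100, 100, 1000)                     # Δt = t_post - t_pre (ms)
dw = np.where(dt > 0,
              A_plus * np.exp(-dt / tau_plus),
              -A_minus * np.exp(dt / tau_minus))

plt.plot(dt, dw)
plt.axhline(0, color="gray", lw=0.5); plt.axvline(0, color="gray", lw=0.5)
plt.xlabel("Δt = t_post − t_pre (ms)"); plt.ylabel("Δw")
plt.title("STDP window")
plt.show()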

# Minimal TD(0) value estimation on a random-walk chain of states
import numpy as np

N, E = 7, 500                                # states 0..N-1, number of episodes
alpha, gamma = 0.1, 0.95                     # step size (kiwango cha hatua), discount (kipunguzio)
V = np.zeros(N)
rng = np.random.default_rng(0)

for episode in range(E):
    s = N // 2                               # start in the middle of the chain
    while 0 < s < N - 1:                     # both ends of the chain are terminal
        s2 = s + rng.choice([-1, 1])         # random policy (sera nasibu)
        r = 1.0 if s2 == N - 1 else 0.0      # reward only at the right end (illustrative)
        delta = r + gamma * V[s2] - V[s]     # TD error (kosa la utabiri)
        V[s] += alpha * delta
        s = s2
print(V)                                     # estimated state values under the random policy

Reference Book: N/A

Author name: SIR H.A.Mwala Work email: biasharaboraofficials@gmail.com
#MWALA_LEARN Powered by MwalaJS #https://mwalajs.biasharabora.com
#https://educenter.biasharabora.com
