Building a Vertical AI Powerhouse from Used Parts: Wall-Mounted and Wallet-Friendly!
A couple of years ago, in a fit of what I can only describe as “productive procrastination,” I built my own Raspberry Pi Kubernetes Cluster. This DIY adventure taught me more about modern distributed systems than any textbook ever could — and left me with an unhealthy obsession for blinking LEDs.
Fast forward to today, I found myself eager to dive into the world of AI. The plan? Build a powerful AI system from scratch. My goals were twofold:
- Learn the inner workings of AI systems
- Create a rig capable of running Large Language Models (LLMs) and image generation models locally
Why build it locally instead of using cloud services? This approach offers several advantages:
- Complete control over hardware and software
- No ongoing subscription costs
- Enhanced data privacy
- A deeper understanding of AI infrastructure
The obvious solution? A powerful PC with a beefy GPU. But two problems arose: my shoebox-sized workroom and my weeping wallet. Then, inspiration struck. Why not build a wall-mounted, low-budget ML rig from recycled components?
And so, armed with a mission, a dusty 2012 gaming PC from my basement, and a disregard for conventional computer placement, I embarked on creating a wall-mounted budget AI powerhouse. It’s eco-friendly, wallet-friendly, and might just make me look like a tech wizard.
Designing My Vertical AI Beast
Here’s how I tackled the unique challenges of wall-mounting an AI powerhouse:
- Noise Reduction: ML workstations can generate significant noise under constant load. To mitigate this, I selected components optimized for quiet operation.
- Thermal Management: Cooling is crucial in server design, as nearly 100% of energy consumed is converted to heat and overheating leads to performance throttling. The vertical orientation in a spacious room provides a natural advantage for heat dissipation, effectively using the entire room as a heat sink.
- Airflow Design: Unlike traditional servers, which pull air through the chassis and exhaust it out the back, I designed the fan configuration to direct air toward the components rather than away from them. This approach minimizes airflow in my workspace, preventing my hair from doing the Marilyn Monroe during video calls.
- Wall Protection: To prevent heat damage to the wall, I planned for backplates mounted at a distance from the wall surface.
- Power Efficiency: Energy conservation was a priority. I repurposed a 700W PSU from my old PC, which should suffice for the initial build.
- GPU Selection: As the core component for ML tasks, I aimed for a GPU with at least 12GB VRAM to support LLM and image generation operations.
- Budget Constraints: I allocated €500 for this project.
With these considerations in mind, I was ready to transform my wall into a temple of computational glory.
The frame was designed to integrate seamlessly with my existing wall setup, allowing for future expansions.
Finding the sweet spot for wall-mounting was a balancing act. I settled on a 20mm gap between the components and the aluminum backplate, with another 30mm between the backplate and the wall. This setup keeps things cool without turning my room into a tech obstacle course. It’s close enough to make my wall feel special, but far enough to prevent my wallpaper from catching fire. The result? A rig that’s both space-efficient and cable-management friendly.
Component Selection
I planned the whole setup starting from the GPU outwards, as the GPU would be the heart of the system.
GPUs
The question was how to get hold of cheap but capable GPUs, and my web search led me to a video comparing used GPUs by the bang you get for your buck (https://www.youtube.com/watch?v=YiX9p8A7LqE&t=1231s).
So I looked up the best option within my budget and searched the web for a shopping opportunity. Since China banned crypto mining, there are a lot of used server GPUs from Asia on the market, and I ordered one for €167 (the original price was €6,000). In the end, I selected the P100 over the M40 because of its half-precision support, which can theoretically halve the memory footprint of some models. The P40 was what I initially wanted because of its 24GB of RAM, but the ones I could find were all around €280, which was too expensive for 8 additional GB.
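To see why half precision matters, here’s a back-of-the-envelope sketch (my own rule of thumb, not an exact formula) of how much VRAM a model needs at different precisions:

```python
def vram_gb(params_billion, bytes_per_param, overhead=1.2):
    """Very rough inference VRAM estimate: parameter count times bytes
    per parameter, plus an assumed ~20% for activations and KV cache."""
    return params_billion * bytes_per_param * overhead

# A hypothetical 7B-parameter model against the P100's 16 GB:
print(vram_gb(7, 4))  # fp32: 33.6 GB, far too big
print(vram_gb(7, 2))  # fp16: 16.8 GB, half the fp32 footprint and borderline
```

The weights alone at fp16 (14 GB) would fit; the 20% overhead factor is the uncertain part of this estimate.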
Next, I had to find a cooling solution for the GPU. The fans usually used for this type of card are mounted at the ends to push air through the card. That way, you can stack many cards in parallel in a case and keep everything space-optimized. The downside of this solution is that the fans are quite small and therefore unimaginably loud. I decided to go with a DIY solution I found on YouTube and ordered two silent fans to mount on top of the GPU for €23.
Motherboard, RAM and CPU
Since I wanted a setup that could support up to two GPUs and enough RAM to hold the data for heavy training jobs in memory, I looked for components compatible with that plan. Since the cards were not planned to be connected via NVLink (a high-speed interconnect technology for NVIDIA GPUs that can speed up communication between cards), I needed PCIe slots with decent throughput.
In the end, I went with a used ATX X99 workstation motherboard with an LGA2011 socket. I bought it in a bundle with an 18-core Intel Xeon (E5-2686 v4) and 128GB of DDR3 RAM. Buying everything as a bundle was cheaper than shopping for the parts separately and came with a price tag of €280.
For CPU cooling, I decided on a water cooler, since they are usually much quieter than their air counterparts; that added another €86.
Oh, by the way: you need a button to turn the motherboard on. I had some parts lying around to assemble one. If you do not, you need a momentary button and two pin cables to connect to the power-switch header; your motherboard’s manual shows which pins to use.
PSU
For the single-GPU setup, I planned to reuse the 700W PSU from my old PC, so no costs here. But to make sure it really meets my requirements, I made a short calculation:
- GPU: NVIDIA Tesla P100: ~250W
- CPU: Intel Xeon E5–2686 v4 (18 cores): ~145W
- X99 workstation motherboard: ~50W
- RAM: 128GB DDR3: ~20W
- Water cooler: ~10W
- Fans (2 for GPU cooling): ~5W (2.5W each)
- Other components (SSD, miscellaneous): ~20W
Total estimated power consumption: 250W + 145W + 50W + 20W + 10W + 5W + 20W = 500W
That gives me 200W of buffer for any peak consumption, which is double the usual 20% of proposed overhead.
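The same budget can be written down in a few lines of Python, using the estimates from the list above:

```python
# Per-component power estimates from the list above, in watts
draw = {
    "Tesla P100": 250,
    "Xeon E5-2686 v4": 145,
    "X99 motherboard": 50,
    "128GB RAM": 20,
    "water cooler": 10,
    "GPU fans (2x)": 5,
    "SSD & misc": 20,
}

psu_watts = 700
total = sum(draw.values())     # estimated system draw
buffer = psu_watts - total     # headroom for peaks
print(total, buffer)           # 500 200
```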
Monitor and Peripherals
Unfortunately, the motherboard has no onboard graphics, and since server GPUs have no video output, I needed another small GPU (€12) to get some kind of video output to my monitor.
As a monitor, I had an old 19" display lying around, and for mounting it I chose the cheapest VESA option I could find, for €9.
Regarding peripherals, I planned to use the machine only via remote access once everything was set up. For the setup phase, I used an old mouse and keyboard I had lying around.
Disk
To boot and transfer data quickly, it’s vital to use an SSD as the main drive, plus an HDD for very large datasets so your disk will not run out of space too fast. I could reuse the SSD and HDD from my old PC and still have room on the motherboard to add an additional M.2 SSD if needed.
Assembly Material: Backplates, Screws, Cables, and Mounting
As a mounting solution for the wall, I planned on using some wood (€5) and as spacers for the components, I ordered distance elements for €9.
For screws and dowels, I used what I could find in my workshop.
I ordered the first four backplates for the first level in black aluminum from a specialized online shop to give the whole thing a distinctive appearance. This added another €37.
For cable management, I reused as many cables as possible from my old PC, so I only had to order a few new ones.
In summary, we have the following parts and price tags:

| Part | Price |
| --- | --- |
| NVIDIA Tesla P100 (used) | €167 |
| 2 silent GPU fans | €23 |
| X99 motherboard, Xeon E5-2686 v4, 128GB RAM (bundle) | €280 |
| Water cooler | €86 |
| Small display GPU | €12 |
| VESA monitor mount | €9 |
| Wood for mounting | €5 |
| Spacers | €9 |
| Aluminum backplates | €37 |
That makes a total of €628, which is about 25% over budget. But I wanted the motherboard, CPU, and RAM to be capable of supporting a more modern GPU, like an NVIDIA A100, when I decide on the next upgrade, and that was more important to me than sticking perfectly to the budget.
Building the Frame and Mount
To assemble the backplates with the components, I placed the components on top of the backplates, marked the positioning of the screws, and drilled some holes at the marked positions.
First, I positioned the wooden placeholders and marked their positions on the backplate. I drilled holes through the backplate and assembled it to the wooden placeholders on the wall. Then the components could be assembled, using the spacers and previously drilled holes.
The GPU was a bit of a shocker. The video I’d seen about adding a fan to a K80 showed an open heatsink design, which lets fans mounted on top blow air through the slots of the heatsink. The P100 instead has a closed heatsink design that is beneficial in a server chassis but a showstopper for top-mounted fans.
I tried cutting the metal sheet with scissors, which worked surprisingly well. Afterwards, I removed the glass from the cover and cut away some more plastic from the case to make room for the air the fans blow through the heatsink. I then mounted the fans and was ready.
After everything was assembled my wall looked like this and I could start with the software setup.
Software Setup
To set everything up, I connected a mouse and keyboard and hooked up the monitor. Since the rig is meant to run as a server in the long run, those cables will be removed as soon as everything works as expected.
As often with these older motherboards, the GPU was not detected at first. The fix is to enable support for that much VRAM in the BIOS settings: I found the option under advanced settings as “Above 4G Decoding” and enabled it.
OS: Ubuntu
The OS needs to be compatible with the GPU and the software I wanted to run, so I chose the system that supported all the software and drivers I needed and that, on top of that, I was very familiar with: Ubuntu.
To install the OS use an external USB drive and follow the official install guide steps:
https://ubuntu.com/tutorials/install-ubuntu-desktop#1-overview
You can alternatively install an Ubuntu version without a GUI if you do not need the desktop interface; later on, you will barely need it.
SSH and VNC server for connection
Optional: If you want to control your desktop remotely, you can use VNC. I’ve figured out that standing in front of a wall-mounted PC to configure it is not the most ergonomic option.
SSH is the way to connect to the rig on a day-to-day basis. It can be set up using:
sudo apt install openssh-server
Install NVIDIA Driver
Next, you’ll need compatible NVIDIA drivers and the CUDA toolkit.
sudo apt-get install nvidia-driver-535
sudo apt-get install -y nvidia-cuda-toolkit
Check if the NVIDIA Card is detected correctly:
nvidia-smi
OK, nice. Everything was set up correctly. Next step: put some load on the components.
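If you later want to monitor the card from scripts, nvidia-smi can emit machine-readable CSV via its query options. Here’s a small Python sketch that parses that output; the sample line uses illustrative values, not measurements from my rig:

```python
import csv
import io

def parse_smi(csv_text):
    """Parse lines produced by:
    nvidia-smi --query-gpu=name,memory.total,temperature.gpu --format=csv,noheader
    Returns one dict per GPU."""
    rows = []
    for fields in csv.reader(io.StringIO(csv_text)):
        name, mem, temp = (f.strip() for f in fields)
        rows.append({"name": name, "memory": mem, "temp_c": int(temp)})
    return rows

# Illustrative sample output (values are made up):
sample = "Tesla P100-PCIE-16GB, 16384 MiB, 34\n"
print(parse_smi(sample))

# On the live system you would capture the real output, e.g. with
# subprocess.check_output(["nvidia-smi", "--query-gpu=name,memory.total,temperature.gpu",
#                          "--format=csv,noheader"], text=True)
```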
Our First ML App
As usual with ML tasks, I needed Python and pip:
sudo apt install python3
sudo apt install python3-pip
And a proper benchmarking kit, which I found on GitHub: https://github.com/thedatadaddi/BenchDaddi
I tried to benchmark with the BERT model:
git clone https://github.com/thedatadaddi/BenchDaddi
cd BenchDaddi
python bert_train_test.py
Everything seemed to be functioning: the GPU’s power consumption spiked, and after thirty minutes, the rig hadn’t transformed into a gigantic fireball.
Ollama
To do something useful with the rig, I tried running an LLM.
curl -fsSL https://ollama.com/install.sh | sh
ollama run llama3.2
After the download, we can query Llama 3.2 and get an answer.
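Besides the interactive CLI, Ollama also exposes an HTTP API on port 11434, which is handy for scripting against the rig over the network. A minimal Python sketch (the prompt and model name are just examples):

```python
import json
from urllib import request

def build_payload(model, prompt):
    """JSON body for Ollama's /api/generate endpoint.
    stream=False returns one complete response instead of chunks."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def ask(prompt, model="llama3.2", host="http://localhost:11434"):
    """Send a prompt to the local Ollama server and return its reply text."""
    req = request.Request(
        f"{host}/api/generate",
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# print(ask("Why is the sky blue?"))  # requires the ollama service to be running
```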
To make your local setup feel like ChatGPT, you can install Docker and run Open WebUI with the following command.
sudo docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
The next Level
As with any tech project, the urge to upgrade is ever-present. Here’s what’s on the horizon for our wall-mounted AI powerhouse:
- Why stop at one GPU when you can have two? Or three? — I’d really like to run a 70b model, someday.
- Imagine the rig pulsing with different colors based on its current workload — a visual RGB stripe representation of AI in action.
- Fan Speed Control for noise reduction during less intensive tasks.
- …
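For the fan speed idea, a first sketch could be a simple temperature-to-PWM curve: a quiet floor when the GPU is idle, a linear ramp under load. The thresholds below are placeholders, not values tuned for the P100:

```python
def fan_duty(temp_c, t_min=40, t_max=80, d_min=20, d_max=100):
    """Map a GPU temperature (Celsius) to a fan PWM duty cycle (%):
    stay at a quiet d_min below t_min, ramp linearly to full speed
    at t_max. All thresholds are placeholder assumptions."""
    if temp_c <= t_min:
        return d_min
    if temp_c >= t_max:
        return d_max
    return d_min + (d_max - d_min) * (temp_c - t_min) / (t_max - t_min)

print(fan_duty(30), fan_duty(60), fan_duty(90))  # 20 60.0 100
```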
Final Thoughts & Lessons Learned
Energy Efficiency and Unexpected Benefits: The AI rig isn’t just a computational powerhouse; it’s also an efficient 500W heater for those chilly winter days. With over 99% of energy converted to heat during operation, it’s a testament to the law of conservation of energy in action. Who knew AI could keep you warm?
Challenges & Lessons Learned: Building this rig was a journey filled with obstacles and revelations. Cable management in a wall-mounted setup proved more challenging than anticipated, teaching me the value of meticulous planning. I also learned that the world of used enterprise hardware is a goldmine for budget builds, offering incredible performance-to-price ratios.
Components and Costs: By carefully selecting components and leveraging the used market, I managed to build a formidable AI rig for just €628. This experience highlighted the importance of prioritizing key components (like the GPU) while finding creative solutions for others. It’s proof that high-performance computing doesn’t always require a high-end budget.
This project demonstrates that with creativity, research, and a willingness to get your hands dirty, you can build powerful systems on a budget. Don’t be afraid to think outside the box — or in this case, outside the traditional PC case!
I’d love to see your custom builds!