https://github.com/anty-filidor/knapsack-problem

Comparison of linear programming and heuristic as a case study of knapsack problem
https://github.com/anty-filidor/knapsack-problem

experiment heuristics linear-programming python

Last synced: 11 months ago
JSON representation

Comparison of linear programming and heuristic as a case study of knapsack problem

Host: GitHub
URL: https://github.com/anty-filidor/knapsack-problem
Owner: anty-filidor
Created: 2022-01-27T22:08:18.000Z (over 4 years ago)
Default Branch: master
Last Pushed: 2022-01-30T21:31:37.000Z (over 4 years ago)
Last Synced: 2025-02-21T14:24:28.640Z (over 1 year ago)
Topics: experiment, heuristics, linear-programming, python
Language: Python
Homepage:
Size: 163 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# Knapsack problem - linear programming vs. naive heuristic

## Problem

We need to pack a knapsack which has certain capacity (_C_ - e.g. 20 kg). There
are several (_n_) items that can be loaded into it. Each item has its weight
(_w_ - e.g. 2 kg) and value (_p_ - e.g. 30 €). The task is to choose a subset
of items that (1) can be stored inside knapsack, so that it is not overloaded
and (2) their total value is maximal.

We can write the problem with following equations:

Model equation

This problem can be solved with multiple approaches. We selected two methods
and, after they had been implemented, we compared their efficiency.

## Methods

### Linear programming

First method was implemented using linear programming approach. We just
rewrote equations into the syntax of the selected library
([pulp](https://coin-or.github.io/pulp/index.html)). We used
[CBC](https://projects.coin-or.org/Cbc) solver to tackle equations. As a result
we obtained function _knapsack_gd_ in the script
[linprog_method.py](./linprog_method.py).

It's good to mention that this approach guaranteed to obtain optimal solution
for the problem.

### Naive heuristic

Second method was a greedy heuristic. We borrowed some ideas from
[article](https://www.pythonpool.com/knapsack-problem-python/) available
online and improved it. The operation principle was to (1) compute 'usability'
for each item (as its _p/w_), (2) sort all items descending by that feature and
(3) pack item by item to the knapsack following obtained sequence until it is
full. This method has been implemented as function _knapsack_gd_ in the script
[greedy_method.py](./greedy_method.py).

Of course this method, as all heuristics, didn't guarantee to get optimal
solution.

## Setting up experiment

To answer the question "which of two methods is better" we prepared an
experiment to compare (1) time efficiency and (2) correctness of algorithms as
a function of data complexity.

### Data generator

As first step, a utility that generates data with varying 'complexity' was
implemented. Every individual of created dataset had to meet following
conditions:

- sum of weights of items must be smaller than _1.3C_
- each item can have weight from range _[1, 10]_
- each item can have value from range _[1, 10]_

The parameter that determined a 'complexity' of the data was _C_, such that
_C_ of *j*th generated tuple is _50_ units higher than _C_ of *j-1*th generated
tuple.

Function that generates data can be found in
[data_generator.py](./data_generator.py) script.

### An 'engine'

In order to compute results, we prepared a dedicated set of functions
([timer.py](./timer.py)). Their responsibility was to measure time of execution
of each evaluated method for varying input data, check if result obtained for
greedy approach is optimal, and visualise experiment on the chat.

## Results and discussion

We ran several experiments for different ranges of _C_. As a result, an
interesting phenomena has been observed for small and relatively big values of
_C_ (see figures below):

Results for small C

Results for big C

Analysis of the charts can tell us that greedy implementation is much faster
than its linear counterpart. Here, it's good to mention, that function
_knapsack_gd_ wasn't prepared to meet high-efficiency condition.

We can also notice, that _knapsack_gd_ is almost accurate as linear programming
implementation for small _C_. However, for higher data complexity it gets less
precise, so that it cannot find optimal solution!

## Conclusions

Experiment that has been conducted using this codebase was intended to show
differences in solving the knapsack problem with linear programming framework
and heuristic. Without surprises, we were able to see, that those two,
competitive methods have their own advantages and drawbacks. There is a
trade-off between time efficiency and precision of obtained result. Therefore,
it's very important to determine the exact nature of the problem before
implementation. When it comes to accuracy, linear programming wins, but when
speed is the main goal, and we don't care if the result will be optimal -
greedy method should be chosen.

## How to run the code

- create environment (`conda create --name knapsack python=3.9`)
- activate environment (`conda activate knapsack`)
- install required libraries (`pip install -r requirements.txt`)
- run `main.py` (`python main.py`)

## References

\[1\] https://coin-or.github.io/pulp/index.html
\[2\] https://www.pythonpool.com/knapsack-problem-python/
\[3\] https://realpython.com/linear-programming-python/#linear-programming-python-implementation
\[4\] https://projects.coin-or.org/Cbc

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/anty-filidor/knapsack-problem

Awesome Lists containing this project

README