Hello everybody

I've long been interested in Paul Cockshott's New Harmony algorithm but am completely unfamiliar with Julia.

Has anyone done any work with it?

My idea, specifically, is to see whether we can make more realistic "toy" economies (millions of goods, etc.) to run through the planning algorithm. In particular, I wanted to include locations by treating the same product at different locations as different products. Say we wanted to model a plan using Walmart as an example: their website has 70 million SKUs and they have over 5,000 store locations, so we'd need to model 70M * 5k = 350G goods.

Also, I've never used forums before so I apologize if I should have just made a post instead of a topic.

Nope, this is how you do it! :)

Is there a written description of how it works anywhere? Once you know the logic of it, you should be able to implement it in any language. For example, I was able to take Ian Wright's Social Architecture model, which was written in Mathematica, and recreate it in R based on his write-up.

Paul's Harmony algorithm is a type of barrier method for linear programming. I suspect there exist problems for which it has poor convergence. Either way, I suggest reading the literature on solving large linear programs. Perhaps we could organize a section on the website listing relevant papers? There are also lots of LP solvers available already; Wikipedia has a big ol' list.

As far as implementation goes, one has to be very careful to avoid copying data around and to make proper use of sparse matrices. The Octave implementation of a large-scale central-path solver that I wrote a while back is very careful about this. Perhaps I should upload it somewhere. Also, when you bring time into the mix, a lot of coefficients are copies of each other, since the technical coefficients don't change between time steps. A custom solver could exploit that for efficiency.

At the kind of scale you're talking about you unavoidably have to start thinking about parallelism. I've tinkered a bit with an MPI implementation but lost interest, partly because it'd be "optimizing in the dark" without real data and a real political system to embed it in.
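For anyone who wants to see the general shape of a barrier method before diving into the literature, here's a minimal log-barrier sketch for a small dense LP in Python. This is a toy illustration of the technique, not Harmony itself; the function and parameter names are my own, and a real large-scale solver would use sparse factorizations and a proper feasible-start phase.

```python
import numpy as np

def barrier_lp(c, A, b, t=1.0, mu=10.0, tol=1e-6, max_outer=60):
    """Toy log-barrier method for:  minimize c.x  subject to  A x <= b.
    Repeatedly minimizes  t*(c.x) - sum(log(b - A x))  with Newton's
    method, then increases t by a factor mu.  Assumes x = 0 is
    strictly feasible (all entries of b positive)."""
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(max_outer):
        for _ in range(100):                       # inner Newton loop
            s = b - A @ x                          # slacks, must stay > 0
            grad = t * c + A.T @ (1.0 / s)
            hess = A.T @ np.diag(1.0 / s**2) @ A   # dense: fine for a toy only
            dx = np.linalg.solve(hess, -grad)
            step = 1.0                             # backtrack to stay feasible
            while np.any(b - A @ (x + step * dx) <= 0):
                step *= 0.5
            x = x + step * dx
            if np.linalg.norm(grad) < tol * t:     # looser tolerance as t grows
                break
        if m / t < tol:                            # duality-gap bound ~ m/t
            return x
        t *= mu
    return x
```

On a box-constrained toy problem (maximize x + y on the square -1 <= x, y <= 1) this converges to the corner (1, 1), tracking the central path as t grows.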

There is a PDF included with the repo, so I will probably be able to figure it out.

Now the only problem is to make a realistic toy economy for testing.

Would anyone have an idea or source for, on average, how many inputs there are for a given product and how many other products a single product is an input for?

I feel that would be the biggest part of getting a realistic estimate of how long it would take to plan an economy with the algo.

It sounds like a lot of this is above my knowledge level, unfortunately. Thank you for the information though!

Very interesting. I think a topic about solving linear programs would be a great idea. I'm less familiar with that area, but I've also been hitting up against computation problems recently as I try to manipulate large datasets. I've been finding that it's essential to work backwards from what you're looking for, decreasing the size of the matrices as much as possible.

Most real-world input-output tables, including the OECD's Input-Output Tables (IOTs), use industries rather than products as the unit of analysis. From a practical standpoint this makes a lot of sense, since there are fixed costs associated with maintaining production that don't scale with units of output, and there are pricing schemes that also aren't tied to units of output but instead derive from a variety of business strategies (subscription vs. product-sales models, for example).

If you were actually planning an economy, your input-output tables would have to be more sophisticated: you'd need to distinguish between fixed and variable costs.

We can easily put an upper limit on the number of inputs required for a good: the total number of outputs in the economy. You could probably save time by creating a bundle of common inputs that practically everyone uses and assigning approximate requirements to each industry in proportion to its size and its ratio of fixed to variable costs.
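As a concrete starting point for a toy economy along those lines, here's a Python sketch (using scipy.sparse) that builds a random technical-coefficients matrix in which every industry consumes a shared bundle of common inputs plus a handful of industry-specific ones. All the sizes and coefficient ranges are made up for illustration, not calibrated to any real IO table.

```python
import numpy as np
from scipy import sparse

def make_toy_io_matrix(n_industries=10_000, n_common=50, n_specific=10, seed=0):
    """Random sparse technical-coefficients matrix A (inputs x outputs).
    Every industry consumes the same 'common bundle' of inputs (think
    energy, transport, packaging) plus a few industry-specific inputs.
    Coefficients are scaled so each column sums to < 1 (productive)."""
    rng = np.random.default_rng(seed)
    rows, cols, vals = [], [], []
    common = np.arange(n_common)                 # the shared input bundle
    for j in range(n_industries):
        specific = rng.choice(np.arange(n_common, n_industries),
                              size=n_specific, replace=False)
        inputs = np.concatenate([common, specific])
        coeffs = rng.uniform(0.0, 0.5 / len(inputs), size=len(inputs))
        rows.extend(inputs)
        cols.extend([j] * len(inputs))
        vals.extend(coeffs)
    return sparse.csc_matrix((vals, (rows, cols)),
                             shape=(n_industries, n_industries))
```

With these defaults the matrix has 60 non-zeros per column regardless of n_industries, so density falls as the economy grows, which is roughly the behavior you'd want when stress-testing a sparse solver.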

I remember I took a look at this a while ago, but it's been a while, so I'll have to re-familiarize myself. One issue that slightly bothers me just glancing over it is that it shares a problem a lot of planning schemes have.

Here's a twitter post I made a while back stating this issue and a potential alternative:

https://twitter.com/analytichegel/status/1550453296157958146

So far, storage for the system matrix seems to be the limiting factor. In my experiments I assume that there are 160 non-zeros per column, which means 160 inputs + outputs per production unit on average.

The latest idea I've had for maximizing the size of the matrix is to use multiple M.2 drives, because motherboards currently max out at 12 TiB of RAM. With something like the X11QPL motherboard you can fit 22 NVMe drives of 8 TiB each, which together with the RAM comes to 188 TiB in total. That translates to around 156,000,000,000 variables.
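For what it's worth, that arithmetic can be checked under explicit assumptions about the storage format. Assuming a CSC-style layout (one value and one row index per non-zero, plus one pointer per column), single-precision values with 32-bit indices land in the ballpark of the 156-billion figure, while double precision with 64-bit indices roughly halves it:

```python
def max_variables(total_bytes, nnz_per_col, bytes_per_value, bytes_per_index,
                  bytes_per_colptr=8):
    """How many columns (variables) of a CSC sparse matrix fit in a
    byte budget: each column stores nnz_per_col (value, row-index)
    pairs plus one column pointer."""
    bytes_per_col = (nnz_per_col * (bytes_per_value + bytes_per_index)
                     + bytes_per_colptr)
    return total_bytes // bytes_per_col

TiB = 2**40
budget = 188 * TiB
n32 = max_variables(budget, 160, 4, 4)   # float32 values, int32 indices
n64 = max_variables(budget, 160, 8, 8)   # float64 values, int64 indices
print(n32)   # about 160 billion
print(n64)   # roughly half that
```

So the 156-billion figure is consistent with 160 non-zeros per column at about 8 bytes per non-zero; going to full double precision with 64-bit indices would cut the capacity to around 80 billion variables.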

Is 160 a realistic estimate?

My first thought (which may not be possible) is to split up commodities as follows:

If a commodity needs more than X inputs (say 20), you could give it Y intermediate steps, even if these intermediate steps don't reflect anything real.

So if it takes 110 inputs to make a commodity (call it C), you could have inputs 1-20 go into making C_part_a, then C_part_a plus inputs 21-39 go into making C_part_b, and so on until you arrive at the true final commodity, C.

This would increase the size of the IO matrix by at least 3 to 5 times, but it would be much sparser.
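A quick Python sketch of that splitting scheme, just to make the bookkeeping concrete (the names and the 20-input threshold are arbitrary choices, not anything from the algorithm itself): the first pseudo-commodity takes the first 20 real inputs, and each later stage takes the previous stage's output plus up to 19 more.

```python
def split_inputs(inputs, max_inputs=20):
    """Split a long input list for one commodity into a chain of
    pseudo-commodities, each needing at most max_inputs inputs.
    Stage 1 takes max_inputs real inputs; every later stage takes
    the previous stage's output plus (max_inputs - 1) more real
    inputs.  Returns the list of per-stage input lists."""
    if len(inputs) <= max_inputs:
        return [list(inputs)]
    stages = [list(inputs[:max_inputs])]
    pos, stage_no = max_inputs, 1
    while pos < len(inputs):
        take = inputs[pos:pos + max_inputs - 1]
        # f"stage_{n}" stands in for the pseudo-commodity C_part_<n>
        stages.append([f"stage_{stage_no}"] + list(take))
        pos += max_inputs - 1
        stage_no += 1
    return stages
```

For the 110-input example this produces 6 stages in place of 1 row, while every column of the resulting IO matrix stays at 20 non-zeros or fewer.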

@joe that's an interesting idea. Bundling inputs together could probably be taken even further if there are common pairs/correlations. After each iteration you could store the most common bundles as intermediary variables, which could then be referenced as a single variable rather than having to call up each constituent input every time.
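One cheap way to find candidate bundles, sketched in Python: count how often pairs of inputs co-occur across product recipes and promote the most frequent pairs to intermediary variables. (Toy code with made-up input names; a real pass would mine larger itemsets than pairs.)

```python
from collections import Counter
from itertools import combinations

def most_common_input_pairs(recipes, top=5):
    """Count how often each unordered pair of inputs co-occurs across
    product recipes; the most frequent pairs are candidates for being
    replaced by a single shared pseudo-input."""
    pair_counts = Counter()
    for inputs in recipes:
        # sort so ("energy", "steel") and ("steel", "energy") count as one pair
        pair_counts.update(combinations(sorted(set(inputs)), 2))
    return pair_counts.most_common(top)
```

Run over the full recipe list after each planning iteration, this gives a ranked list of bundles worth turning into intermediary variables.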

There have to be at least semi-straightforward ways to do this: Walmart and Amazon centrally plan much of their enterprises, and companies have been vertically integrating since capitalism first developed.

I can definitely see my knowledge gaps growing. I'd love to talk to some supply chain management people or industrial planners.

Looks like 70 million SKUs is an overestimate of what actually planning an economy would take.

An unverifiable source claims Amazon sells over 75 million products, but 57.2 million of them are books. A more realistic estimate for distinguishable products would be around 18 million.

My instinct is actually to reduce this number further because of the duplication of effort capitalism creates (brand-name vs. off-brand versions of identical things, various volumes or amounts of the same product). Also, many of these products are going to be variations on other products: differently scented soaps, differently colored versions, etc. If anything, 18 million distinct products for an entire economy is an overestimate, imo.

But even if we could reduce that number by a factor of 2 or 3, we would be increasing it again with pseudo-products if we used the intermediate-goods method of making the IO matrix sparser for memory management.

If we're going to operate in this world of rough estimates, I would say you'd be using an IO matrix of around 18 million commodities.