  • Creating weighted network

    In this dataset, a product is produced by the node in id1 and moved to the node in id2. Part of the data looks as follows:
    Code:
    ssc install nwcommands
    clear
    input id1 id2
    1 2
    1 1
    3 4
    5 6
    3 3
    6 4
    7 3
    6 6
    2 3
    2 2
    4 5
    3 9
    8 8
    7 8
    3 4
    8 4
    4 7
    3 2
    3 2
    9 1
    9 9
    3 6
    5 1
    end
    
    // setting it as network data and plotting the network
    nwset id1 id2, edgelist name(test)
    nwplot test
    However, in some rows id1 and id2 are identical, meaning that the product was not moved. To paint a realistic picture, is it possible to incorporate into the network the fact that some products are not moved, i.e. that the network only represents part of all the products? Perhaps this could be done via a weighted network, where the weight of a tie is the percentage of products moved from node i to node j relative to the total number of products node i produced. For that I would first need to know how many products are moved from node i to node j, and how many of node i's products are not moved at all (the rows where id1 == node i and id2 == node i).
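    For illustration, a rough sketch of the counts I have in mind (assuming each observation is one unit of product; the variable names are just placeholders):
    Code:
    bysort id1: gen nproduced = _N          // total units produced by node i
    bysort id1 id2: gen npair = _N          // units with this particular id1/id2 combination
    gen nstayed = npair if id1 == id2       // units of node i that were never moved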

    I have used this website's search function and looked at the help page for the nwcommands package, but I did not find what I am looking for. Nwcommands does include options to value ties between nodes, but I am not sure whether that is what I am looking for.
    Last edited by Dougie Jones; 08 Aug 2018, 05:43.

  • #2
    Some progress on my part. I first decided to drop the rows where id1 == id2, then to count the duplicate rows by id1 and id2, followed by categorizing them into several value categories.
    Code:
    ssc install nwcommands
    clear
    input id1 id2
    1 2
    1 1
    3 4
    5 6
    3 3
    6 4
    7 3
    6 6
    2 3
    2 2
    4 5
    3 9
    8 8
    7 8
    3 4
    8 4
    4 7
    3 2
    3 2
    9 1
    9 9
    3 6
    5 1
    end
    
    drop if id1 == id2
    
    
    * Counting how often each connection is used (count duplicates), and then assigning values for strength of ties
    sort id1 id2
    quietly by id1 id2:  gen dup = cond(_N==1,0,_n)
    tab dup
    gen value = 1
    replace value = 2 if dup > 1 // at least 2 products were moved from node i to node j
    replace value = 3 if dup > 2 // at least 3
    replace value = 4 if dup > 3 // at least 4
    drop dup
    tab value
    
    nwset id1 id2 value, edgelist name(test) directed
    
    nwtabulate test
    I believe that when I tabulate my network with -nwtabulate test-, I can see how often each value occurs. But the result does not look like what I get when I tabulate my data with -tab value- before setting the network.

    Does anyone have some pointers or suggestions?



    • #3
      My understanding is that each observation per id1 represents one unit of product, regardless of whether it was moved or not, and that each id1/id2 pair represents a transfer of one unit of product from id1 to id2. On those assumptions, here's a way to do what I think you want. What I have is likely not the shortest or most efficient approach, but I wanted it to be transparent and demonstrate some technique. This gives you an edge list, with weights on each edge, which you can feed to -nwcommands-.
      Code:
      bysort id1: gen nproduct = _N // amount of production, moved or not
      drop if (id1 == id2) // product not moved
      bysort id1 id2: gen nmoved12 = _N if (_n==1) // total moved from id1 to id2
      drop if missing(nmoved12) // variable only calculated once per id1/id2 pair
      gen pctmoved = 100 * nmoved12/nproduct
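      For example, reusing the -nwset- syntax from #2 with pctmoved as the tie value, something like this (an untested sketch; the network name is arbitrary) should then create the weighted, directed network:
      Code:
      nwset id1 id2 pctmoved, edgelist name(weighted) directed
      nwplot weighted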



      • #4
        Thank you, Mike! That was actually very helpful. I decided that I will subset the data prior to feeding it to nwcommands, i.e. selecting the connections above a threshold pctmoved and then creating a network.
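        Something along these lines (a sketch; the 10 percent cutoff and the network name are just arbitrary examples):
        Code:
        keep if pctmoved > 10                                   // hypothetical threshold
        nwset id1 id2 pctmoved, edgelist name(subnet) directed
        nwplot subnet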
