Constructing transition probability matrix

William 08/07/2018. 3 answers, 179 views
list-manipulation matrix markov-chains markov-process

I have the following list:

x={"A", "A", "A", "E", "D", "D", "D", "C", "B", "E", "E", "E", "D", \
"B", "A", "D", "B", "E", "C", "A", "D", "A", "A", "A", "A", "C", "C", \
"C", "D", "D", "E"}

I want to make the Markov transition probability matrix of first order. To do so I have started by writing:

Partition[x, 2, 1] // Sort // Counts

This will give:

<|{"A", "A"} -> 5, {"A", "C"} -> 1, {"A", "D"} -> 2, {"A", "E"} -> 
  1, {"B", "A"} -> 1, {"B", "E"} -> 2, {"C", "A"} -> 1, {"C", "B"} -> 
  1, {"C", "C"} -> 2, {"C", "D"} -> 1, {"D", "A"} -> 1, {"D", "B"} -> 
  2, {"D", "C"} -> 1, {"D", "D"} -> 3, {"D", "E"} -> 1, {"E", "C"} -> 
  1, {"E", "D"} -> 2, {"E", "E"} -> 2|>

above shows the frequencies of state transition A to A, A to B, A to C, A to D and A to E and so on for other letters, I wonder how can I show this result as a matrix?

3 Answers


Henrik Schumacher 08/07/2018.

You can use SparseArray with additive assembly as follows:

x = RandomChoice[Alphabet["English", "IndexCharacters"], 1000000];
data = Flatten[ToCharacterCode[x]] - (ToCharacterCode["A"][[1]] - 1); // AbsoluteTiming // First
A = With[{
 spopt = SystemOptions["SparseArrayOptions"]},
  Internal`WithLocalSettings[
   (*switch to additive assembly*)
   SetSystemOptions["SparseArrayOptions" -> {"TreatRepeatedEntries" -> Total}],

   (*assemble matrix*)
   SparseArray[
    Partition[data, 2, 1] -> 1, 
    Max[data] {1, 1}
    ]

   , 
   (*reset "SparseArrayOptions" to previous value*)
   SetSystemOptions[spopt]]
  ]; // AbsoluteTiming // First

0.739454

0.114682

Remarks

  • As the timings suggest, it is worthwhile to avoid strings in the first place.

  • Formerly, I used LetterNumber, but ToCharacterCode is much, much faster.

  • It is "TreatRepeatedEntries" -> Total which enables summing of entries. Count is not needed anymore. Developer`ToPackedArray might speed up things a bit if x is very long. The other hokus-pokus is for making things bulletproof against aborts (i.e., options are reset even if computations are interrupted). See also (37566) and (136017).


Anton Antonov 08/08/2018.

You can use the package CrossTabulate.m. (More detailed references are given in this MSE answer.)

Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/CrossTabulate.m"]

cmat = CrossTabulate[Partition[x, 2, 1]];

MatrixForm[cmat]

enter image description here

cmat["SparseMatrix"] = cmat["SparseMatrix"]/Total[cmat["SparseMatrix"], {2}];
MatrixForm[cmat]

enter image description here

ArrayRules[cmat["SparseMatrix"]]

(* {{1, 1} -> 5/9, {1, 5} -> 1/9, {1, 4} -> 2/9, {1, 3} -> 1/
  9, {2, 5} -> 2/3, {2, 1} -> 1/3, {3, 2} -> 1/5, {3, 1} -> 1/
  5, {3, 3} -> 2/5, {3, 4} -> 1/5, {4, 4} -> 3/8, {4, 3} -> 1/
  8, {4, 2} -> 1/4, {4, 1} -> 1/8, {4, 5} -> 1/8, {5, 4} -> 2/
  5, {5, 5} -> 2/5, {5, 3} -> 1/5, {_, _} -> 0} *)

kglr 08/08/2018.

You can also use EstimatedProcess and MarkovProcessProperties as follows:

states = DeleteDuplicates[x];
ordering = Ordering[states]; 
data = ArrayComponents @ x ;
estproc = EstimatedProcess[data, DiscreteMarkovProcess[Length@states]];
tm = MarkovProcessProperties[estproc, "TransitionMatrix"][[ordering, ordering]] 

TeXForm[TableForm[tm, TableHeadings -> {states[[ordering]], states[[ordering]]}]]

$\begin{array}{cccccc} & \text{A} & \text{B} & \text{C} & \text{D} & \text{E} \\ \text{A} & \frac{5}{9} & 0 & \frac{1}{9} & \frac{2}{9} & \frac{1}{9} \\ \text{B} & \frac{1}{3} & 0 & 0 & 0 & \frac{2}{3} \\ \text{C} & \frac{1}{5} & \frac{1}{5} & \frac{2}{5} & \frac{1}{5} & 0 \\ \text{D} & \frac{1}{8} & \frac{1}{4} & \frac{1}{8} & \frac{3}{8} & \frac{1}{8} \\ \text{E} & 0 & 0 & \frac{1}{5} & \frac{2}{5} & \frac{2}{5} \\ \end{array}$


HighResolutionMusic.com - Download Hi-Res Songs

1 (G)I-DLE

POP/STARS flac

(G)I-DLE. 2018. Writer: Riot Music Team;Harloe.
2 The Chainsmokers

Beach House flac

The Chainsmokers. 2018. Writer: Andrew Taggart.
3 Ariana Grande

​Thank U, Next flac

Ariana Grande. 2018. Writer: Crazy Mike;Scootie;Victoria Monét;Tayla Parx;TBHits;Ariana Grande.
4 Nicki Minaj

No Candle No Light flac

Nicki Minaj. 2018. Writer: Denisia “Blu June” Andrews;Kathryn Ostenberg;Brittany "Chi" Coney;Brian Lee;TJ Routon;Tushar Apte;ZAYN;Nicki Minaj.
5 Clean Bandit

Baby flac

Clean Bandit. 2018. Writer: Jack Patterson;Kamille;Jason Evigan;Matthew Knott;Marina;Luis Fonsi.
6 Imagine Dragons

Bad Liar flac

Imagine Dragons. 2018. Writer: Jorgen Odegard;Daniel Platzman;Ben McKee;Wayne Sermon;Aja Volkman;Dan Reynolds.
7 Halsey

Without Me flac

Halsey. 2018. Writer: Halsey;Delacey;Louis Bell;Amy Allen;Justin Timberlake;Timbaland;Scott Storch.
8 BTS

Waste It On Me flac

BTS. 2018. Writer: Steve Aoki;Jeff Halavacs;Ryan Ogren;Michael Gazzo;Nate Cyphert;Sean Foreman;RM.
9 BlackPink

Kiss And Make Up flac

BlackPink. 2018. Writer: Soke;Kny Factory;Billboard;Chelcee Grimes;Teddy Park;Marc Vincent;Dua Lipa.
10 Fitz And The Tantrums

HandClap flac

Fitz And The Tantrums. 2017. Writer: Fitz And The Tantrums;Eric Frederic;Sam Hollander.
11 Backstreet Boys

Chances flac

Backstreet Boys. 2018.
12 Kelly Clarkson

Never Enough flac

Kelly Clarkson. 2018. Writer: Benj Pasek;Justin Paul.
13 Diplo

Close To Me flac

Diplo. 2018. Writer: Ellie Goulding;Savan Kotecha;Peter Svensson;Ilya;Swae Lee;Diplo.
14 Anne-Marie

Rewrite The Stars flac

Anne-Marie. 2018. Writer: Benj Pasek;Justin Paul.
15 Little Mix

Woman Like Me flac

Little Mix. 2018. Writer: Nicki Minaj;Steve Mac;Ed Sheeran;Jess Glynne.
16 Imagine Dragons

Machine flac

Imagine Dragons. 2018. Writer: Wayne Sermon;Daniel Platzman;Dan Reynolds;Ben McKee;Alex Da Kid.
17 Little Mix

The Cure flac

Little Mix. 2018.
18 Bradley Cooper

Always Remember Us This Way flac

Bradley Cooper. 2018. Writer: Lady Gaga;Dave Cobb.
19 Rita Ora

Velvet Rope flac

Rita Ora. 2018.
20 Lady Gaga

I'll Never Love Again flac

Lady Gaga. 2018. Writer: Benjamin Rice;Lady Gaga.

Related questions

Hot questions

Language

Popular Tags