Haskellのマルコフ連鎖アルゴリズム(1)

記事が長くて読みにくくなったのでHaskellのマルコフ連鎖アルゴリズム(2)に続く.

http://hackage.haskell.org/package/markov-chain

ソースをダウンロードして,解凍してghicの:lでロード.

Prelude > :l MarkovChain.hs 
[1 of 1] Compiling Data.MarkovChain ( MarkovChain.hs, interpreted )
Ok, modules loaded: Data.MarkovChain.
*Data.MarkovChain Random> 

実行

Data.MarkovChain Random>  take 100 $ run 2 "The sad cat sat on the mat. " 0 (Random.mkStdGen 123)
"The sat on the sat sad cat sat. The sad cat sad cat. The mat. The sad cat sat sat. The sad cat on th"

createMap関数の説明

createMap :: (Ord a) => Int -> [a] -> Map [a] [a]
createMap n x =
   let xc = cycle x
       sufxs   = map (take n) (iterate tail xc)
       imgxs   = map (:[]) (drop n xc)
       mapList = takeMatch x (zip sufxs imgxs)
   in  Map.fromListWith (++) mapList

以下で細かく見ていく.

map (take n) (iterate tail xc)
  • iterate関数:「次に進む」関数と初期値をもらって無限リストを作る.
  • take関数:take n l はリストlの先頭n個の項を要素に持つリストを返す.

cycleで無限リストになったのを入力としているので,ここでは先頭20文字を入力とする.無限リストじゃないのにiterate関数でtailを使っているので,最終的に空リストにtailを適用してエラーとなる.繰り返したリストを入力しているので,語尾と先頭が繋がることに注意(この場合は「.T」).

Prelude> map (take 2) (iterate tail (take 20 (cycle "The sad cat sat.")))
["Th","he","e "," s","sa","ad","d "," c","ca","at","t "," s","sa","at","t.",".T","Th","he","e "," ",""
,"*** Exception: Prelude.tail: empty list
map (:[]) (drop n xc)
  • drop関数:リストの最初のn個を捨てる.
Prelude> map (:[]) (drop 2 "The sad cat sat.")
["e"," ","s","a","d"," ","c","a","t"," ","s","a","t","."]
takeMatch x (zip sufxs imgxs)

x = "The sad cat sat."
xc = cycle x
の場合で考える.

  • zip sufxs imgxs

sufxsとimgxsの両方とも無限リストであることに注意.最後の..は先頭に戻って繰り返しを表す.

[("Th","e"),("he"," "),("e ","s"),(" s","a"),("sa","d"),("ad"," "),("d ","c"),(" c","a"),
 ("ca","t"),("at"," "),("t ","s"),(" s","a"),("sa","t"),("at","."),("t.","T"),(".T","h"),("Th","e"),..]

takeMatch関数に関しては,以下の「takeMatch関数の説明」を参照.
takeMatch関数は,2個引数(両方リスト)をとって,第一引数のリストの要素の数だけ第二引数のリストの要素を先頭から返す関数である.よって,第一引数と第二引数の対応を表にするとわかりやすい(気がする).xとzip sufxs imgxsの各要素の対応を以下の表に示す.

xの各要素 zip sufxs imgxsの各要素
'T' ("Th","e")
'h' ("he"," ")
'e' ("e ","s")
' ' (" s","a")
's' ("sa","d")
'a' ("ad"," ")
'd' ("d ","c")
' ' (" c","a")
'c' ("ca","t")
'a' ("at"," ")
't' ("t ","s")
' ' (" s","a")
's' ("sa","t")
'a' ("at",".")
't' ("t.","T")
'.' (".T","h")

takeMatch x (zip sufxs imgxs)の結果を一応以下に示しておく.

[("Th","e"),("he"," "),("e ","s"),(" s","a"),("sa","d"),("ad"," "),("d ","c"),(" c","a"),
 ("ca","t"),("at"," "),("t ","s"),(" s","a"),("sa","t"),("at","."),("t.","T"),(".T","h")]
Map.fromListWith (++) mapList
  • fromListWith関数: 結合関数(第一引数)でkey/valueのペアからMapを作成する.
Prelude Data.Map> :t fromListWith
fromListWith :: (Ord k) => (a -> a -> a) -> [(k, a)] -> Map k a

Prelude Data.Map> fromListWith (++) [(5,"a"), (5,"b"), (3,"b"), (3,"a"), (5,"a")]
fromList [(3,"ab"),(5,"aba")]

Map.fromListWith (++) mapListの実行結果を一応示す.

fromList [(" c","a"),(" s","aa"),(".T","h"),("Th","e"),("ad"," "),("at",". "),("ca","t"),
("d ","c"),("e ","s"),("he"," "),("sa","td"),("t ","s"),("t.","T")]

takeMatch関数の説明

2個引数(両方リスト)をとって,第一引数のリストの要素の数だけ第二引数のリストの要素を先頭から返す関数.

takeMatch :: [b] -> [a] -> [a]
takeMatch = zipWith (flip const)
  • const関数: 2個引数を取って,第1引数を返す.
  • flip関数: 2引数関数を与えると,引数順をひっくり返した関数を返す.flip f x y = f y x

つまりflip constは,「2個引数を取って第二引数を返す関数」となる.

  • zipWith関数: 2個リストを引数にとって第一引数の関数を各要素に適用する
Prelude> :t zipWith
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]

Prelude> zipWith (+) [1,2,3] [3,2,1] 
[4,4,4]

takeMatchの実行例を以下に示す(zipWithのほうで).

Prelude> zipWith (flip const) [1, 2] [3, 4, 5] 
[3,4]