## Algorithm for isotopy classes of latin squares - latin-square

### How to find key words in paragraph of text?

```I'm trying to find a fast (milliseconds to seconds) way to take an inputted block of text and test it against a large list (11 million) of specific words/phrases. I would like to see which of those words/phrases appear in the inputted paragraph.
We use JavaScript and have SQL, MongoDB and DynamoDB as existing data stores that we can integrate this solution into.
I've searched for this problem but can only find ways to check whether given words exist in a text, not the other way around.
All ideas are welcome!
```
```In cases like these you want to eliminate as much unnecessary data as possible. Assuming that order matters:
1. First things first, make sure you have a B-tree index built on your phrases database, clustered on the phrase. This will speed up range lookups.
2. Let n = 2 (or 1, if you're into that).
3. Split the text block into phrases of n words each and query the dictionary for phrases that begin with any of those n-word phrases (e.g. LIKE 'My Phrase%'). Thanks to the index, this won't require a string comparison against each of the 11 million phrases.
4. Remember the phrases that are an exact match.
5. Let n = n + 1.
6. Repeat from step 3 using the reduced dictionary, until the reduced dictionary is empty.
You can also make small optimizations here and there depending on the kind of matches you're looking for, such as, not matching across punctuation, only phrases of a certain word length, etc. In any case, I'd expect the time bottleneck here to be on disk access, rather than actual comparisons.
Also, I'm pretty sure I based this algorithm off of an existing one but I don't remember its name so bonus points to anyone who can name it. I think it had something to do with data warehousing/mining and calculating frequencies and patterns?
```
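
The growing n-gram idea above can be sketched in memory as follows. This is only an illustration under assumptions: `tokenize`, `buildPhraseIndex` and `findPhrases` are hypothetical names, a `Set` of phrase prefixes stands in for the indexed prefix query, and instead of shrinking the dictionary it shrinks the list of start positions whose n-gram still begins some phrase. Whether 11 million phrases fit in memory this way is not something the thread establishes.

```javascript
// Hypothetical tokenizer: lowercase words, digits and apostrophes only.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9']+/g) || [];
}

// In-memory stand-in for the indexed phrase table:
// `exact` holds whole phrases, `prefixes` holds every leading n-gram of a phrase.
function buildPhraseIndex(phrases) {
  const exact = new Set();
  const prefixes = new Set();
  for (const phrase of phrases) {
    const words = tokenize(phrase);
    if (words.length === 0) continue;
    exact.add(words.join(" "));
    for (let n = 1; n <= words.length; n++) {
      prefixes.add(words.slice(0, n).join(" "));
    }
  }
  return { exact, prefixes };
}

function findPhrases(text, index) {
  const words = tokenize(text);
  const found = new Set();
  // Start positions whose current n-gram still begins at least one phrase.
  let starts = words.map((_, i) => i);
  for (let n = 1; starts.length > 0; n++) {
    const next = [];
    for (const start of starts) {
      if (start + n > words.length) continue; // n-gram would run past the text
      const gram = words.slice(start, start + n).join(" ");
      if (!index.prefixes.has(gram)) continue; // no phrase begins like this
      if (index.exact.has(gram)) found.add(gram); // remember exact matches
      next.push(start);
    }
    starts = next;
  }
  return [...found];
}

const index = buildPhraseIndex(["red fox", "fox", "new york city", "hill"]);
console.log(findPhrases("The red fox left New York City.", index));
```

The loop stops as soon as no start position survives, so the number of rounds is bounded by the longest phrase that actually begins somewhere in the text.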

### Calculating the similarity between 2 sentences

```I would like to calculate the similarity between 2 sentences, and I need a percentage value that says "how well" they match each other. Sentences like:
1. The red fox is moving on the hill.
2. The black fox is moving in the bill.
I was considering Levenshtein distance, but I am not sure about it because it is described as a way of finding the similarity between "2 words". So can Levenshtein distance help me, or what other method can help me? I will be using JavaScript.
```
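
On the Levenshtein part of the question: the distance is defined on arbitrary strings, so it can be applied to whole sentences character by character. A minimal sketch, assuming the percentage is taken as one minus the distance divided by the longer length (`similarityPercent` is a hypothetical name, and that normalization is one choice among several):

```javascript
// Classic dynamic-programming edit distance, keeping only two rows.
function levenshtein(a, b) {
  let previous = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const current = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      current[j] = Math.min(
        previous[j] + 1,        // deletion
        current[j - 1] + 1,     // insertion
        previous[j - 1] + cost  // substitution (or match)
      );
    }
    previous = current;
  }
  return previous[b.length];
}

// Assumed normalization: 100% for identical strings, 0% when every character differs.
function similarityPercent(a, b) {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 100;
  return (1 - levenshtein(a, b) / longest) * 100;
}

console.log(
  similarityPercent(
    "The red fox is moving on the hill.",
    "The black fox is moving in the bill."
  )
);
```

Because this compares characters, two sentences with the same words in a different order can score low; the word-based measures in the answers below behave differently in that case.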
```Try this solution for JS string diff
```
```Use the Jaccard index. You can find implementations in any language, including JavaScript (here is one, didn't test it personally though).
```
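
For reference, a word-level Jaccard index fits in a few lines. This is a sketch, not the implementation the answer links to; `wordSet` and `jaccard` are hypothetical names, and splitting on letters, digits and apostrophes is an assumption about what counts as a word.

```javascript
// Set of distinct lowercase words in a sentence.
function wordSet(sentence) {
  return new Set(sentence.toLowerCase().match(/[a-z0-9']+/g) || []);
}

// Jaccard index: |A ∩ B| / |A ∪ B| over the two word sets.
function jaccard(a, b) {
  const setA = wordSet(a);
  const setB = wordSet(b);
  if (setA.size === 0 && setB.size === 0) return 1; // two empty sentences match trivially
  let shared = 0;
  for (const word of setA) {
    if (setB.has(word)) shared++;
  }
  return shared / (setA.size + setB.size - shared);
}

// The two example sentences share 4 of 10 distinct words, so this prints 0.4.
console.log(
  jaccard(
    "The red fox is moving on the hill.",
    "The black fox is moving in the bill."
  )
);
```

Multiply by 100 to get the percentage the question asks for.
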
```This is what I would do, depending on how important this is. If this is medium to low priority, here is a simple algorithm:
Scan all sentences and count how often each word occurs.
Filter out the most common words, such as the ones that appear in 30% of sentences, i.e. don't count these. Words like "at", "the" and "as" would then hopefully not be counted.
Then do your bag-of-words comparison.
But the context of why you want to do this is really important. The example you gave could be for students learning English, for instance. There are different algorithms I would use if I were trying to see whether crowdsourced users are describing the same paragraph versus whether article topics are similar enough for a suggested-reading section.
```
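
A rough sketch of that filtering step, assuming a corpus of sentences is available to count from. The names `commonWords` and `bagOfWordsOverlap`, the strict "more than 30%" cut-off and the overlap formula are all assumptions for illustration; the answer does not pin any of them down.

```javascript
function tokenize(sentence) {
  return sentence.toLowerCase().match(/[a-z0-9']+/g) || [];
}

// Words that occur in more than `threshold` of all sentences.
function commonWords(sentences, threshold = 0.3) {
  const sentenceCounts = new Map();
  for (const sentence of sentences) {
    // Count each word once per sentence, however often it repeats there.
    for (const word of new Set(tokenize(sentence))) {
      sentenceCounts.set(word, (sentenceCounts.get(word) || 0) + 1);
    }
  }
  const common = new Set();
  for (const [word, count] of sentenceCounts) {
    if (count / sentences.length > threshold) common.add(word);
  }
  return common;
}

// Share of the remaining (uncommon) words that the two sentences have in common.
function bagOfWordsOverlap(a, b, common) {
  const bagA = new Set(tokenize(a).filter((w) => !common.has(w)));
  const bagB = new Set(tokenize(b).filter((w) => !common.has(w)));
  const union = new Set([...bagA, ...bagB]);
  if (union.size === 0) return 0;
  let shared = 0;
  for (const word of bagA) {
    if (bagB.has(word)) shared++;
  }
  return shared / union.size;
}

const corpus = ["the cat sat", "the dog ran", "the bird flew", "the fish swam"];
const common = commonWords(corpus); // only "the" passes the 30% cut-off here
console.log(bagOfWordsOverlap("the cat sat", "the cat ran", common));
```

With only a handful of sentences almost every word crosses the threshold, so the cut-off only becomes meaningful on a reasonably large corpus.
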
`A common method to compute the similarity of two sentences is cosine similarity. I don't know whether an implementation exists in JavaScript. Cosine similarity looks at words rather than single letters. The web is full of explanations, for example here.`
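
A minimal sketch of cosine similarity over raw word counts, written from the standard definition rather than from any particular library (`termCounts` and `cosineSimilarity` are hypothetical names, and no weighting such as TF-IDF is applied):

```javascript
// Word -> number of occurrences in the sentence.
function termCounts(sentence) {
  const counts = new Map();
  for (const word of sentence.toLowerCase().match(/[a-z0-9']+/g) || []) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

// Cosine of the angle between the two word-count vectors.
function cosineSimilarity(a, b) {
  const countsA = termCounts(a);
  const countsB = termCounts(b);
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const [word, count] of countsA) {
    dot += count * (countsB.get(word) || 0);
    normA += count * count;
  }
  for (const count of countsB.values()) {
    normB += count * count;
  }
  if (normA === 0 || normB === 0) return 0; // an empty sentence matches nothing
  return dot / Math.sqrt(normA * normB);
}

// For the two example sentences the dot product is 7 and both squared norms are 10,
// so the similarity is 0.7.
console.log(
  cosineSimilarity(
    "The red fox is moving on the hill.",
    "The black fox is moving in the bill."
  )
);
```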

### Equations Game. Finding solution for a random Goal from random Resource

```I'm writing an AI for a game called Equations in JavaScript.
For the sake of the question, let's pretend that the game is this simple:
There is a Goal, which can be a number (e.g. 5) or an expression that can be
evaluated to a number (e.g. 2+3).
There are 20 random numbers (1-9) and operators (+-*/) I can use; let's call them
the array resources[]. I need to find one combination of the elements
in resources[] that evaluates to the Goal; let's call that the
solution (e.g. 1+6-2+1).
There is no limit on how many numbers or operators I can use, as long
as they are in resources[]. Once they are used, they cannot be used
again. So the longest solution might be 20 symbols long.
Is there a way I can quickly find such a solution? The AI might need to evaluate this many times when analysing a move's score.
Thanks guys
```
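
No answer is recorded here for this question. Purely as an illustration of the problem as stated, one possible approach is a depth-first search with backtracking over the remaining resources. Everything in this sketch is an assumption: the names `evaluate` and `findSolution`, the `maxLength` cap that keeps the search bounded, the Goal already being reduced to a number, and the usual precedence of `*` and `/` over `+` and `-`. It returns the first solution it finds, not necessarily the shortest.

```javascript
// Evaluate tokens such as [2, "+", 3, "*", 4] with the usual operator precedence.
function evaluate(tokens) {
  const terms = [tokens[0]];
  const addOps = [];
  for (let i = 1; i < tokens.length; i += 2) {
    const op = tokens[i];
    const value = tokens[i + 1];
    if (op === "*") terms[terms.length - 1] *= value;
    else if (op === "/") terms[terms.length - 1] /= value;
    else {
      addOps.push(op);
      terms.push(value);
    }
  }
  let total = terms[0];
  for (let i = 0; i < addOps.length; i++) {
    total = addOps[i] === "+" ? total + terms[i + 1] : total - terms[i + 1];
  }
  return total;
}

// resources: numbers (1-9) and operator strings; each entry may be used once.
function findSolution(resources, goal, maxLength = 7) {
  const counts = new Map();
  for (const r of resources) counts.set(r, (counts.get(r) || 0) + 1);
  // Iterate over distinct symbols only, so duplicates do not repeat work.
  const numbers = [...counts.keys()].filter((k) => typeof k === "number");
  const operators = [...counts.keys()].filter((k) => typeof k === "string");
  const tokens = [];

  const take = (sym) => { counts.set(sym, counts.get(sym) - 1); tokens.push(sym); };
  const giveBack = (sym) => { counts.set(sym, counts.get(sym) + 1); tokens.pop(); };

  // Invariant: tokens is empty or ends with an operator, so a number comes next.
  function search() {
    for (const n of numbers) {
      if (counts.get(n) === 0) continue;
      take(n);
      if (Math.abs(evaluate(tokens) - goal) < 1e-9) return true;
      if (tokens.length + 2 <= maxLength) {
        for (const op of operators) {
          if (counts.get(op) === 0) continue;
          take(op);
          if (search()) return true;
          giveBack(op);
        }
      }
      giveBack(n);
    }
    return false;
  }

  return search() ? tokens.join("") : null;
}

console.log(findSolution([2, 3, 4, "+", "*"], 14)); // "2+3*4"
```

The search space grows very quickly with `maxLength`, so for repeated calls during move scoring some caching or a tighter length cap would likely be needed.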

### \$S_{10}\$, the symmetric group and GAP

```Here is a question. As you can see, the problem comes down to finding an element of a certain order in \$S_{10}\$. I tried to answer it using GAP, but GAP couldn't handle the symmetric group \$S_{10}\$. What can we do in this situation? Is there a way to define this large finite group in GAP? Thanks for your time.
```
```I'd be really interested to see what you've tried. Anyhow, for the symmetric group in particular, where the conjugacy classes are well known, you can easily examine the orders of their representatives:
gap> 15 in List(ConjugacyClasses(SymmetricGroup(10)),c->Order(Representative(c)));
true
Moreover, you can use the character table library without even constructing the group itself:
gap> 15 in OrdersClassRepresentatives(CharacterTable("S10"));
true
```
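
The same check can be done without a computer algebra system, because the conjugacy classes of \$S_{10}\$ correspond to the partitions of 10 (the cycle types) and the order of a permutation is the least common multiple of its cycle lengths. A small sketch outside GAP, with hypothetical function names:

```javascript
function gcd(a, b) {
  while (b !== 0) {
    [a, b] = [b, a % b];
  }
  return a;
}

function lcm(a, b) {
  return (a / gcd(a, b)) * b;
}

// All element orders in S_n: the lcm of the parts, over every partition of n.
function elementOrders(n) {
  const orders = new Set();
  // Build partitions with non-increasing parts, carrying the lcm so far.
  function extend(remaining, maxPart, currentLcm) {
    if (remaining === 0) {
      orders.add(currentLcm);
      return;
    }
    for (let part = Math.min(remaining, maxPart); part >= 1; part--) {
      extend(remaining - part, part, lcm(currentLcm, part));
    }
  }
  extend(n, n, 1);
  return [...orders].sort((x, y) => x - y);
}

const orders = elementOrders(10);
console.log(orders.includes(15)); // true: cycle type 5 + 3 + 1 + 1 has order lcm(5, 3) = 15
```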

### Objective-C jumbled letters solver

```I am trying to create an app on the iPhone that, given 6 letters, outputs all the possible 3-6 letter English words. I already have a dictionary; I just want to know how to do it.
I searched around and only found Scrabble solvers in Python or word search grid solutions.
I think a brute force search would do, but I'm concerned about the performance. Code is not necessary, a link to an algorithm or the algorithm itself would be fine, I think I'll be able to manage once I get that.
Thanks!
```
```If you're concerned about performance, this method might do the trick. It involves some preprocessing but would allow for nearly-instantaneous lookup for anagrams.
Create a data structure that maps a String key to a List of Strings (I'm more familiar with Java, so in that case it would be a Map<String,List<String>>). This will store your dictionary.
Define a function that takes a string and outputs the same letters arranged alphabetically. For example, hello would become ehllo; kitchen would become cehiknt. I'll refer to this function as keyify(word).
Here's the preprocessing part: for each item in your dictionary, find the list for that item's key (keyify(item)) and add that item to the list.
When it comes time to look up anagrams of a given word, just look up the list at the keyify of that word. For example, if the input was kitchen, the keyify would be cehiknt, and looking that up in your map should result in a list containing kitchen, thicken and whatever other anagrams of kitchen that I forgot :P
```
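
A minimal sketch of the keyify map, written in JavaScript to match the other sketches here rather than in Objective-C or Java. `keyify` is the name used in the answer; `buildAnagramIndex` and `anagramsOf` are hypothetical names.

```javascript
// Same letters, arranged alphabetically: "kitchen" -> "cehiknt".
function keyify(word) {
  return word.toLowerCase().split("").sort().join("");
}

// Preprocessing: group every dictionary word under its key.
function buildAnagramIndex(dictionary) {
  const index = new Map();
  for (const word of dictionary) {
    const key = keyify(word);
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(word);
  }
  return index;
}

// Lookup: one map access per query.
function anagramsOf(word, index) {
  return index.get(keyify(word)) || [];
}

const index = buildAnagramIndex(["kitchen", "thicken", "chicken", "hello"]);
console.log(anagramsOf("kitchen", index)); // ["kitchen", "thicken"]
```

Note that this finds only words using all of the given letters; covering the 3-6 letter case from the question would mean looking up the keys of the letter subsets as well.
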
```Check out this answer: Algorithm to generate anagrams. Look at the answer by Jason Cohen. Alphabetize the 6-letter word, then run through your dictionary, alphabetize each word and compare.
```
```I actually ran into this issue a few weeks back, and the most efficient way I could figure out to solve it was this:
I found all the subsets of the given string (this takes O(2^n)).
Then I looked in my dictionary to see whether any word of that size "used up" all the characters of the subset.
For example, given the string "hetre",
and the words "the, there, her" in your dictionary,
you can calculate all subsets:
{h}{e}{t}{r}{e}{he}{ht}{hr}{he}{het}{her}{reh}... there are 32 subsets of "hetre"
Then check whether any of these subsets are similar to the words in the dictionary. In this case,
reh is similar to her, which means that her is a word to be used.
This was the most efficient way I could think of.
Research power sets and think of a way you could write the function that "uses up" the strings.
Another way would be to brute force it by figuring out the power sets for the strings and finding all permutations; this will destroy performance.
Using the first method, mine didn't give me problems until I started entering strings over 15 characters.
Using the second method, I didn't get problems until 7.
```
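
The subset approach can be sketched as below. One assumption goes beyond what the answer states: the "uses up" test is implemented by sorting the letters of each subset and each dictionary word, so that two strings match exactly when they contain the same letters. `wordsFromLetters` and `sortLetters` are hypothetical names.

```javascript
function sortLetters(word) {
  return word.toLowerCase().split("").sort().join("");
}

// All dictionary words of at least `minLength` letters that can be built from `letters`.
function wordsFromLetters(letters, dictionary, minLength = 3) {
  // Group dictionary words by their sorted letters.
  const index = new Map();
  for (const word of dictionary) {
    const key = sortLetters(word);
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(word);
  }

  const chars = letters.toLowerCase().split("");
  const seenKeys = new Set();
  const results = [];
  // Each bitmask picks one subset of letter positions: 2^n subsets in total.
  for (let mask = 1; mask < (1 << chars.length); mask++) {
    const subset = chars.filter((_, i) => mask & (1 << i));
    if (subset.length < minLength) continue;
    const key = subset.sort().join("");
    if (seenKeys.has(key)) continue; // repeated letters produce duplicate subsets
    seenKeys.add(key);
    for (const word of index.get(key) || []) results.push(word);
  }
  return results.sort();
}

console.log(wordsFromLetters("hetre", ["the", "there", "her", "tree", "hat"]));
// ["her", "the", "there", "tree"]
```

For 6 letters this is at most 63 non-empty subsets, each a single map lookup, so no permutations are needed.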