Magic is played with decks of cards, 60 or more, chosen from a pool of legal cards.
Walk Before We Run - ignore sideboard issues
WBWR - make the pool of legal cards small to test the code, add to the pool later.
I'd attack this problem with something like a genetic algorithm. A GA has a simple algorithm:
Create a population, give each member a score.
For as long as you want:
Choose high scoring members, modify them, score the new version, and add them to the population.
Take out members of the population based on some criteria (such as a low score).
Several modifications are usually allowed. Mutation changes a population member slightly. Crossover mixes two members of the population.
Lets make Magic decks our population members. Each member is just a list of card numbers, drawn from the list of legal numbers. Mutation might mean switching a number for another legal number. Crossover might mean choosing a number, N, between 1 and 60, and taking the first N cards from one member, and rest up to 60 from a second member. There are more sophisticated versions of these genetic operators that can be invented, as well.
Since we are constantly testing new decks that are similar to our high scoring members, and throwing away poor decks, the overall population is expected to gradually improve its score. If the poopulation is small, we can also expect it to converge on a single solution, a "killer deck" that is better than all the other decks tested. (Is this a killer deck in real life? No way. It is just good in the metagame that our population represents.) There are also ways to encourage the evolution of 'species' - several different decks that remain in the population for long periods.
Of course, the above description is hiding a big issue behind a "Separation of Concerns" wall. How do we score decks? Well, the best thing to do would be to have a huge round robin tournament, and the score for each deck would be its wins minus its losses. Now you can see why I thought "Self Play" was an important heuristic!
Until you get your game play AI running, you can substitute other scoring methods. Invent some way of assigning points to a deck. If a deck with a creature that requires some color, doesn't contain land of that color, then give that deck a penalty. If the same keyword appears on more cards (so the deck has a 'theme') give the deck a bonus. This kind of quick and dirty scoring is also useful in the beginning of a run of deck development. If our first set of decks is generated by just throwing 60 random numbers together, over and over, those decks are going to be really bad. It might save time to not even bother playing these decks until they reach some basic level of sanity.
I wouldn't overdo the scoring heuristics, though. One of the best things about a GA is that it can find a solution to a problem that a human would never think of. Using a lot of heuristics to score decks, instead of letting them compete head to head, is forcing the algorithm to follow your way of thinking.
So that is my approach in nutshell. Self play involving a variable population which improves over time.