Information geometry (IG) and optimal transport (OT) are two mathematical frameworks for studying geometric structures on spaces of probability distributions. IG originates from coordinate-invariant properties of statistical inference, while OT characterizes the cost-minimizing transport of mass from one distribution to another. Their relationships, and their applications to data science, have attracted growing attention in recent years. This talk begins with an accessible introduction to some core concepts in IG. I will then describe a selection of my recent and ongoing work that extends the classical framework using tools from OT. Topics include the logarithmic divergence (which extends the Bregman divergence), the λ-exponential family (which extends the exponential family), and a pseudo-Riemannian framework showing how IG arises from the geometry of OT. A unifying theme is a generalized convex duality based on a logarithmic cost function.
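As background for the logarithmic divergence mentioned above: the classical Bregman divergence of a differentiable convex function φ is D_φ(x, y) = φ(x) − φ(y) − ⟨∇φ(y), x − y⟩, and with φ taken as the negative entropy it recovers the Kullback–Leibler divergence between probability vectors. A minimal numerical sketch of this standard fact (the function names are illustrative, not from the talk):

```python
import math

def bregman(phi, grad_phi, x, y):
    """Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner

# Negative entropy phi(p) = sum_i p_i log p_i; on the probability simplex,
# its Bregman divergence is the Kullback-Leibler divergence.
neg_entropy = lambda p: sum(pi * math.log(pi) for pi in p)
grad_neg_entropy = lambda p: [math.log(pi) + 1.0 for pi in p]

p = [0.2, 0.3, 0.5]
q = [0.4, 0.4, 0.2]

d = bregman(neg_entropy, grad_neg_entropy, p, q)
kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
print(abs(d - kl) < 1e-12)
```

The logarithmic divergence discussed in the talk generalizes this construction, replacing the linear (affine) support term by one built from a logarithmic cost function.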